Sacrifice

There is no scholarly consensus regarding the meaning of the term ‘sacrifice.’ In its widest usage the term, which derives from the Latin sacrificium, is applied to virtually any form of gift-giving to a deity. A more restricted application limits its use to situations in which the gift is mutilated or destroyed. An even more restrictive usage is suggested by the Oxford English Dictionary (Simpson and Weiner 1933/1989), whose primary definition is ‘the slaughter of an animal (often including the subsequent consumption of it by fire) as an offering to God or a deity.’ The term sacrifice is, therefore, best regarded as a polythetic category which includes heterogeneous modes of behavior motivated by intentions of variable kinds. These include commemoration, initiation, expiation, propitiation, establishing a covenant, or effecting a transition in social status. Regarding sacrifice not as constituting a class of phenomena having a fixed set of features in common but as a multivocal term applying to phenomena that merely have ‘family resemblances’ in common has the advantage of mitigating somewhat the Eurocentric bias inherent in the work of many of the most influential thinkers on the subject, among them Robertson Smith (1894/1927), Hubert and Mauss (1898), and Durkheim (1912). ‘The Sacrifice’ for this generation of scholars was the central feature of religion, a Judaic–Christian bias still evident as late as 1956, the year in which E. E. Evans-Pritchard published his classic, Nuer Religion. Tellingly, Evans-Pritchard did not identify any word in the Nuer language that unambiguously corresponded to the English ‘sacrifice.’ This lack of correspondence appears widespread. Benveniste (1969, p. 223), for example, could discover no common word for sacrifice in the Indo-European languages.
1. The Terminology of Sacrifice

In more elaborate forms than straightforward ritualized gift-giving, sacrifice typically involves a set of four terms. First, there is the sacrificer, the executor of the ritual. Second, there is the sacrifier, the beneficiary of the sacrifice. This may be the executor of the ritual, individual members of the community, or the community in toto. Third, there is the object sacrificed. Although this gift may be anything, ethnographic descriptions usually emphasize the bloody sacrifice of a living being, in which case blood typically is described as connoting such ideas as fertility, renewal, replenishment, vitality, life force, and empowerment. Access by the sacrifier to these desirables is acquired when this substance is shed in ritual by the act of forcing life out of the victim. Substitution, or the use of a surrogate in cases when the offering is that of a human being, is a common occurrence, so frequently resorted to that some writers, including two who have made important contributions to the topic—de Heusch (1986) and Bloch (1992, p. 32)—consider substitution to be a key feature of sacrificial rituals. Fourth, there is the deity to whom the sacrifice is made.
2. Approaches to the Sacrifice

A number of distinguished thinkers have advanced what might be termed ‘approaches’ to the study of sacrifice, which afford insightful, though overlapping, perspectives. Among the most useful are the following five.
2.1 Sacrifice as Gift

For Tylor (1871/1913), following Plato, the sacrifice was a gift made to an anthropomorphic divinity, much as a gift might be given to a chief, whose acceptance made the deity indebted to humanity. This approach was popular with nineteenth-century scholars, who attributed to the act of sacrificing such motives as expiating sins, obtaining benefits, nourishing the deity, and establishing good relations between human beings and spirits. This theory lays stress on the beliefs of the givers rather than on the ritual performance itself, which in this interpretation is relegated to logically subservient status. The approach is at once too broad and too narrow: too broad in that these motives undoubtedly are widespread in rituals of sacrifice all over the world, yet too narrow since sacrificial rituals are employed to do much more than simply offer gifts.
2.2 Sacrifice as Communion

Robertson Smith (1894/1927), in contrast to Tylor, granted priority to ritual, a ranking modern ethnographic fieldwork tends to validate. Robertson Smith’s main point was that early Semitic sacrifice, and by implication sacrifice among nonliterate peoples, was a feast human beings communally shared with their deity. He understood the immolated victim to be the clan totem, an animal that shared in the blood of its human kin. It was also the clan ancestor, which was the clan deity, so in consuming their animal victim members of the community were eating their deity. This act of consumption conferred spiritual power. Later research in the field undermined much of Robertson Smith’s argument, but unquestionably the sacrifice does offer a means by which human beings may conjoin with their deity and establish a relationship, and here Robertson Smith did have insight.

He strongly influenced Hubert and Mauss, whose Essai sur la nature et la fonction du sacrifice, which appeared in 1898, remains the basic work on the topic. Like all the studies of the time, it depended entirely upon secondary and tertiary sources rather than original ethnography, and the authors further restricted their sources to Sanskrit and Hebrew texts. To them the most significant feature of the sacrificial feast was that it provided the mechanism by which communion, in the sense of communication between human beings and deity, could be brought about. It was made possible by the fact that the victim represented both sacrifier and divinity, thus breaking down the barrier between the sacred and the profane (a moral contrast that played a major role in their approach). Yet, although the sacrificial act conjoined the sacred and profane, it could also disjoin them: while there were sacrifices that brought the spirit into the human world and made the profane sacred, there were also sacrifices that removed the spirit from the human world and made what had become sacred, profane. The former Hubert and Mauss called rituals of ‘sacralization;’ the latter, rituals of ‘desacralization.’ Even though Evans-Pritchard—who was very much influenced by their ideas—found this distinction among the Nuer (1956, pp. 197–230), it is by no means as universal as they supposed.

Hubert and Mauss were intrigued by those sacrifices in which it is the deity that offers his life for the benefit of his worshippers. As they saw it, the exemplary sacrifice was that of a god who, through unqualified self-abnegation, offers himself to humankind, in this way bestowing life on lesser beings. The most celebrated example of this is the central ritual of the Catholic religion, the Mass, in which Christ’s sacrifice on the Cross is re-enacted. In a sacrifice of this nature the relationship between the deity and worshippers raises the question of relative status. From Hubert and Mauss’ perspective the relationship is one of grave imbalance since the deity gives more than it gets. This may appear so, but Valeri (1985, pp. 66–7) has advanced a sophisticated argument designed to prove that sacrifice actually creates a bond of ‘mutual’ indebtedness between the human and the divine, which makes each party dependent upon the other. This proposition may not be true universally, but support for Valeri’s argument comes from the bear sacrifices carried out by the Ainu of Japan. In these, the immolation of bears brings fertility to the community and nurture to the spirits (Irimoto 1996).

Durkheim’s (1912) distinctive contribution, which otherwise relied heavily upon Hubert and Mauss, and Robertson Smith, was to promote the importance of sacrificial activities in bringing about social cohesion and the values that contribute to this cohesion. In Timor, a Southeast Asian island, a ritual is performed in which a local king is figuratively slain by members of two ethnically different communities (Hicks 1996). The king, who comes from one of these groups, symbolizes their unity as a single social entity at the same time that he symbolizes their local god. In addition, by periodically immolating their king the two groups regenerate their notions of ‘society’ and ‘god’ as epistemological categories.
2.3 Sacrifice as Causality

For Hocart (1970, p. 217) the most compelling reason for performing collective rituals was the ‘communal pursuit of fertility.’ A sacrifice is offered to the gods in order that life, and the fertility that engenders and sustains it, be acquired by the community, a view that treats sacrifice as a practical device for effecting empirical transformation. Catholic rituals before the Reformation were thought of in this way; afterwards they were reinterpreted as a symbolic process (Muir 1997). For Catholics the Mass materially transformed the bread into Christ’s body; for Protestants the bread merely represented Christ’s sacrifice.
2.4 Sacrifice as Symbol System

More contemporary scholars, such as Valeri (1985), have emphasized the symbolic character of sacrificial behavior, analyzing its structure and isolating its motifs within the wider context of a society’s ideology. In this approach, the array of symbols isolated is then apprehended as a semantic system whose meaning requires decoding. Valeri’s interpretation of Hawaiian sacrifice, for example, requires that sacrificial rituals be understood as symbolic action ‘that effects transformations of the relationships of sacrifier, god, and group by representing them in public’ (1985, pp. 70–1). Although vulnerable to the criticism that it relies too heavily on the subjective bias of the analyst, this approach has the advantage of enabling sacrificial practices to be understood within a wider social context, at the same time as it allows an analysis of sacrifice to open up a distinctive perspective on society.
2.5 Sacrifice as Catharsis

Another approach to sacrifice is to focus on violence. Girard (1972), most notably, has suggested that sacrifice offers a socially controlled, and therefore socially acceptable, outlet for the aggressive urges of human beings. The typical manner by which this is accomplished is to use a sacrificial victim as a scapegoat. A more recent scholar who has stressed the importance of violence in sacrifice is Bloch (1992), but his approach differs from that of Girard in that for Bloch sacrifice requires violence because human beings need ‘to create the transcendental in religion and politics,’ and this is attainable through violence done to the victim (Bloch 1992, p. 7). One difficulty with this approach is that behavior that one might arguably class as sacrificial may involve no violence at all.
3. Origins of Sacrifice

Sacrificial rituals have an extensive pedigree. One of the earliest for which a reasonably detailed record is available dates from about 3,000 years ago. The Vedic Aryans held sacrifice to be their central religious activity (Flood 1996, pp. 40–1); it consisted of putting milk, grains of rice and barley, and domesticated animals into a sacred fire that would transport the offerings to supernatural entities. As the Agnicayana, this ritual is still carried out in Kerala, southwest India. Evidence for sacrificial behavior extends further back than Vedic times, to the civilizations of Sumer 5,000 years ago and China 4,000 years ago, and its presence is attested even further in the past. This long history, taken with the widespread practice of sacrifice all over the world, suggests that a case could be made that sacrificial behavior is one of the natural dispositions common to the species Homo sapiens. As Needham (1985, p. 177) has averred for ritual in general, it may be that sacrifice, ‘Considered in its most characteristic features … is a kind of activity—like speech or dancing—that man as a ceremonial animal happens naturally to perform.’

See also: Commemorative Objects; Exchange in Anthropology; Honor and Shame; Religion: Morality and Social Control; Ritual; Trade and Exchange, Archaeology of
Bibliography

Benveniste E 1969 Le Vocabulaire des institutions indo-européennes. I. Économie, parenté, société; II. Pouvoir, droit, religion. Les Éditions de Minuit, Paris
Bloch M 1992 Prey into Hunter: The Politics of Religious Experience. Cambridge University Press, Cambridge, UK
Durkheim E 1912/1968 Les Formes élémentaires de la vie religieuse: le système totémique en Australie. Presses Universitaires de France, Paris
Evans-Pritchard E E 1956 Nuer Religion. Clarendon Press, Oxford, UK
Flood G O 1996 An Introduction to Hinduism. Cambridge University Press, Cambridge, UK
Frazer J G 1922 The Golden Bough: A Study in Magic and Religion, Vol. 1. Macmillan, London
Girard R 1972 La Violence et le sacré. Bernard Grasset, Paris
de Heusch L 1986 Le Sacrifice dans les religions africaines. Éditions Gallimard, Paris
Hicks D 1996 Making the king divine: A case study in ritual regicide from Timor. Journal of the Royal Anthropological Institute 2: 611–24
Hocart A M 1970 Kings and Councillors: An Essay in the Comparative Anatomy of Human Society. University of Chicago Press, Chicago
Hubert H, Mauss M 1898 Essai sur la nature et la fonction du sacrifice. Année sociologique 2: 29–138
Irimoto T 1996 Ainu worldview and bear hunting strategies. In: Pentikäinen J (ed.) Shamanism and Northern Ecology (Religion and Society 36). Mouton de Gruyter, Berlin, pp. 293–303
Muir E 1997 Ritual in Early Modern Europe. Cambridge University Press, Cambridge, UK
Needham R 1985 Exemplars. University of California Press, Berkeley, CA
Simpson J A, Weiner E S C 1933/1989 Oxford English Dictionary, 2nd edn. Clarendon Press, Oxford, UK
Smith W R 1894/1927 Lectures on the Religion of the Semites, 3rd edn. Macmillan, New York
Tylor E B 1871/1913 Primitive Culture, 4th edn. Murray, London, Vol. 2
Valeri V 1985 Kingship and Sacrifice: Ritual and Society in Ancient Hawaii (trans. Wissing P). University of Chicago Press, Chicago
D. Hicks
Safety, Economics of

The economics of safety is concerned with the causes and prevention of accidents resulting in economic losses. The primary focus is on accidents causing workplace injuries and diseases. The topic is examined using several variants of economic theory. One set of theories applies when there is an ongoing relationship between the party who is generally the source of the accidents and the party who generally bears the costs of the accidents. Negotiations (explicit or implicit) between the parties can provide ex ante economic incentives to reduce accidents. Another set of theories applies when there is no ongoing relationship between the party who causes the accident and the party who suffers the consequences. Ex post financial penalties for the party who caused the harm can deter others from causing accidents.
1. ‘Pure’ Neoclassical Economics

The neoclassical theory of work injuries is examined in Thomason and Burton (1993). Workplace accidents are undesirable by-products of processes that produce goods for consumption (Oi 1974). Several assumptions are used in the neoclassical analysis of these processes. Employers and workers have an ongoing relationship. Employers maximize profits. Workers maximize utility, not just pecuniary income. Workers at the margin are mobile and possess accurate information concerning the risks and consequences of work injuries.

One implication of these assumptions is that employers pay higher wages for hazardous work than for nonhazardous work. On the assumption that accident insurance is not provided by employers or the government, the equilibrium wage for the hazardous work will include a risk premium equal to the expected costs of the injuries, which include lost wages, medical care, and the disutility caused by the injury. If workers are risk averse, the premium will also include a payment for the uncertainty about who will be injured. The employer has an incentive to invest in safety in order to reduce accidents and thus the risk premium. The firm will make safety investments until the marginal expenditure on safety is equal to the marginal reduction in the risk premium. Since there is a rising marginal cost to investments in safety, equilibrium will occur with a positive value for the risk premium, which means that in equilibrium there will be some work injuries.

Variants on neoclassical theory can be generated by changing some of the previous assumptions. For example, if all workers purchase actuarially fair insurance covering the full costs of work injuries, they will be indifferent about being injured. Furthermore, if injured, workers will have no incentive to return to work since the insurance fully compensates them for their economic losses. The change in worker behavior resulting from the availability of insurance is an example of the moral hazard problem, in which the insurance increases the quantity of the events being insured against (i.e., the occurrence of injuries and the durations of the resulting disabilities). These variants on neoclassical economics, with their implications for employee behavior, are discussed in Burton and Chelius (1997).

1.1 Evidence Consistent with the Neoclassical Model

There is a burgeoning literature on compensating wage differentials for the risks of workplace death and injury. Ehrenberg and Smith (1996) report that studies generally find industries having the average risk of job fatalities (about 1 per 10,000 workers per year) pay wages that are 0.5 percent to 2 percent higher than the wages for comparable workers in industries with half that level of risk. Viscusi (1993) reviewed 17 studies estimating wage premiums for the risks of job injuries, and 24 studies estimating premiums for the risks of job fatalities. The total amount of risk premiums implied by these various estimates is substantial: probably the upper bound is the Kniesner and Leeth (1995) figure of $200 billion in risk premiums in the US in 1993.
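To make concrete the arithmetic that links such wage premiums to dollar figures of this kind, consider a stylized value-of-statistical-life calculation. The numbers are illustrative only, chosen to match the magnitudes cited above rather than taken from any particular study. Suppose each worker accepts an extra annual fatality risk of Δp = 1/10,000 in exchange for a wage premium of Δw = $500 per year (about 1 percent of a $50,000 wage, inside the 0.5–2 percent range reported above). The implied value of a statistical life is

\[ \mathrm{VSL} \;=\; \frac{\Delta w}{\Delta p} \;=\; \frac{\$500}{1/10{,}000} \;=\; \$5\ \text{million}. \]

Equivalently, 10,000 such workers collectively bear one expected fatality per year and collectively receive $5 million in risk premiums; this is how compensating wage differential studies generate implicit values of life of the sort used in the regulatory analysis discussed in Sect. 5.3 below.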
The Viscusi (1993) survey of studies of risk premiums for workplace fatalities included evidence from the United Kingdom (UK), Quebec, Japan, and Australia. More recently, Siebert and Wei (1994) found evidence for risk premiums for work injuries in the UK, and Miller et al. (1997) provided additional evidence on compensating differentials for risk of death in Australia.

1.2 Qualifications Concerning the Neoclassical Theory

The model that postulates workplace safety results from risk premiums that provide financial incentives to employers to invest in safety has been challenged. Some critics assert that risk premiums are inadequate because workers lack sufficient information about the risks of work injuries and/or have limited mobility to move to less hazardous jobs. Ehrenberg and Smith (1996) conclude that workers have enough knowledge to form accurate judgments about the relative risks of various jobs. However, Viscusi (1993) refers to a sizable literature in psychology and economics documenting that individuals tend to overestimate low-probability events, such as workplace injuries, which could result in inappropriately large risk premiums. Both Ehrenberg and Smith (1996) and Viscusi (1993) argue there is sufficient worker mobility to generate risk premiums. Another challenger to the neoclassical model is Ackerman (1988), who argues that the labor market does not generate proper risk premiums because certain costs resulting from work injuries are not borne by workers but are externalized.

The empirical evidence on risk premiums has also been challenged or qualified. Ehrenberg and Smith (1996), for example, conclude that the studies of compensating wage differentials for the risks of injury or death on the job ‘are generally, but not completely, supportive of the theory.’ Viscusi (1993) concludes: ‘The wage-risk relationship is not as robust as is, for example, the effect of education on wages.’ The most telling attack on the compensating wage differential evidence is by Dorman and Hagstrom (1998), who argue that most studies have been specified improperly. They find that, after controlling for industry-level factors, the only evidence for a positive compensating wage differential pertains to unionized workers. For nonunion workers, they argue that properly specified regressions suggest that workers in dangerous jobs are likely to be paid less than equivalent workers in safer jobs.

The Dorman and Hagstrom challenge to the previously generally accepted view—that dangerous work results in risk premiums for most workers—is likely to result in new empirical studies attempting to resuscitate the notion of risk premiums. However, even if every empirical study finds a risk premium for workplace fatalities and injuries, the evidence would not conclusively validate the neoclassical economics approach. Ehrenberg (1988) provides the necessary qualification: if there are any market imperfections (such as lack of information or mobility), the ‘mere existence of some wage differential does not imply that it is a fully compensating one.’ And if the risk premium is not fully compensating (or is more than fully compensating), then, inter alia, the market does not provide the proper incentive to employers to invest in safety.

Another qualification concerning the use of risk premiums as a stimulus to safety is that institutional features in a country’s labor market may aid or impede the generation of risk premiums. In the US, the relative lack of government regulation of the labor market and the relative weakness of unions probably facilitate the generation of risk premiums. In countries like Germany, where unions are relatively strong and apparently opposed to wage premiums for risks, and where labor markets appear to be less flexible than those in the US because of government regulation, wages are less likely to reflect the underlying market forces that produce risk premiums.
2. Modified Neoclassical Economics and the Old Institutional Economics

Some economists rely on a ‘modified’ version of neoclassical theory which recognizes that limited types of government regulation of safety and health are appropriate in order to overcome some of the attributes of the labor market that do not correspond to the assumptions of pure neoclassical economics. These attributes include the lack of sufficient knowledge and mobility among employees, and the possible lack of sufficient knowledge or motivation among employers about the relationship between expenditures on safety and the reduction in risk premiums. The ‘old institutional economists’ (OIE) would agree with these critiques of lack of knowledge and mobility, and would also emphasize factors such as the unequal bargaining power of individual workers as another limitation of the labor market. An example of government promulgation of information in order to overcome the lack of knowledge in the labor market is the Occupational Safety and Health Act (OSHAct) Hazard Communication standard, which requires labeling of hazardous substances and notification to workers and customers, and which is estimated to save 200 lives per year (Viscusi 1996). (Other aspects of the OSHAct are discussed below.)
2.1 Experience Rating

An example of a labor market intervention that could improve incentives for employers to invest in safety is the use of experience rating in workers’ compensation programs, which provide medical and cash benefits to workers injured at work. Firm-level experience rating is used extensively in the workers’ compensation programs in the US, and is used to some degree in many other countries, including Canada, France, and Germany. Firm-level experience rating determines the workers’ compensation premium for each firm above a minimum size by comparing its prior benefit payments to those of other firms in the industry (a stylized version of the calculation appears at the end of this section).

In the pure neoclassical economics model, the introduction of workers’ compensation with experience rating should make no difference to the safety incentives for employers compared to the incentives provided by the labor market without workers’ compensation. Under assumptions such as perfect experience rating, risk neutrality by workers, and actuarially fair workers’ compensation premiums, which are explicated by Burton and Chelius (1997), the risk premium portion of the wage paid by the employer will be reduced by an amount exactly equal to the amount of the workers’ compensation premium. Also, under these assumptions, the employer has the same economic incentives to invest in safety both before and after the introduction of the workers’ compensation program. Under an alternative variant of the pure neoclassical approach, in which the assumption of perfect experience rating is dropped, the introduction of workers’ compensation will result in reduced incentives for employers to reduce accidents.

In contrast, the OIE approach argues that the introduction of workers’ compensation with experience rating should improve safety because the limitations of knowledge and mobility and the unequal bargaining power of employees mean that the risk premiums generated in the labor market are inadequate to provide employers the safety incentives postulated by the pure neoclassical approach. Commons (1934), a leading figure in the OIE approach, claimed that unemployment is the leading cause of labor market problems, including injuries and fatalities, because slack labor markets undercut the mechanism that generates compensating wage differentials. Commons asserted that experience rating provides employers economic incentives to get the ‘safety spirit’ that would otherwise be lacking. The modified neoclassical approach also accepts the idea that experience rating should help improve safety by providing stronger incentives to employers to avoid accidents, although its adherents place less emphasis on the role of unemployment in undercutting compensating wage differentials and more emphasis on the failure of employers to recognize the cost savings possible from improved safety without the clear signals provided by experience-rated premiums.

A number of recent studies of the workers’ compensation program provide evidence that should help assess the virtues of the various economic theories. However, the evidence from US studies is inconclusive. One survey of studies of experience rating by Boden (1995) concluded that ‘research on the safety impacts has not provided a clear answer to whether workers’ compensation improves workplace safety.’ In contrast, a recent survey by Butler (1994) found that most recent studies provide statistically significant evidence that experience rating ‘has had at least some role in improving workplace safety for large firms.’ Burton and Chelius (1997) sided with Butler. The beneficial effect of experience rating on safety has been found in Canada by Bruce and Atkins (1993) and in France by Aiuppa and Trieschmann (1998). The mildly persuasive evidence that experience rating improves safety is consistent with the positive impact on safety postulated by the OIE approach and the modified neoclassical economists, and is inconsistent with the pure neoclassical view that the use of experience rating should be irrelevant or may even lead to reduced incentives for employers to improve workplace safety.
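To fix ideas, the experience-rating mechanism reviewed above can be written in a stylized form. Actual rating plans use credibility weights and other refinements that vary by jurisdiction and insurer, so this is a sketch of the logic rather than any plan’s actual formula:

\[ \text{premium} \;=\; E \times \text{manual premium}, \qquad E \;\approx\; \frac{\text{firm’s own prior benefit payments}}{\text{expected payments for comparable firms in the industry}}. \]

A firm whose injury record is better than the industry norm has E < 1 and pays less than the manual rate, so every dollar of accident cost avoided feeds back into a lower premium. This explicit price signal is the marginal safety incentive that the OIE and modified neoclassical views emphasize.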
2.2 Collective Action

While the modified neoclassical economists and the OIE largely would agree on the desirability of several prevention approaches, such as the use of experience rating in workers’ compensation, the OIE endorsed another approach that many modified neoclassical economists would not support, namely the use of collective bargaining and other policies that empower workers. Collective bargaining agreements can minimize unsafe activities or at least explicitly require employers to pay a wage premium for unsafe work. Many agreements also establish safety committees that assume responsibility for certain activities, such as participation in inspections conducted by government officials. If workers are injured, unions can help them obtain workers’ compensation benefits, thereby increasing the financial incentives for employers to improve workplace safety.

The beneficial effects predicted by the OIE approach for these collective efforts appear to be achieved. Several studies, including Weil (1999), concluded that OSHA enforcement activity was greater in unionized firms than in nonunionized firms. Moore and Viscusi (1990) and other researchers found that unionized workers receive larger compensating wage differentials for job risks than unorganized workers. Hirsch et al. (1997) found that ‘unionized workers were substantially more likely to receive workers’ compensation benefits than were similar nonunion workers.’ Experience rating means that these higher benefit payments will provide greater economic incentives for these unionized employers to reduce injuries. There are also safety committees in some jurisdictions that are utilized in nonunionized firms. Burton and Chelius (1997) reviewed several studies involving such committees in the US and Canada, and found limited empirical support for their beneficial effects. Reilly et al. (1995) provided a more positive assessment of their accomplishments in the UK. These studies provide some additional support for the beneficial role of collective action postulated by the OIE.
3. The New Institutional Economics

New institutional economics (NIE) authors generally argue that market forces encourage efficient forms of economic organization without government assistance and that opportunities for efficiency-improving public interventions are rare. This section considers only the Coase theory/transaction costs economics strain of NIE.
3.1 The Coase Theory and Transaction Costs Economics

In the absence of costs involved in carrying out market transactions, changing the legal rules about who is liable for damages resulting from an accident will not affect decisions involving expenditures of resources that increase the combined wealth of the parties. The classic example offered by Coase (1988) involves the case of straying cattle that can destroy crops growing on neighboring land. The parties will negotiate the best solution to the size of the herd, the construction of a fence, and the amount of crop loss due to the cattle whether or not the cattle-rancher is assigned liability for the crop damages. Coase recognized that the assumption that there were no costs involved in carrying out transactions was ‘very unrealistic.’ According to Coase (1988):

In order to carry out a market transaction, it is necessary to discover who it is that one wishes to deal with … and on what terms, to conduct negotiations …, to draw up the contract, to undertake the inspection needed to make sure that the terms of the contract are being observed, and so on. These operations are often … sufficiently costly … to prevent many transactions that would be carried out in a world in which the pricing system worked without cost.

The goal of transaction costs economics is to examine these costs and to determine their effect on the operation of the economy. As scholars of transaction costs have demonstrated, when transaction costs are significant, changing the legal rules about initial liability can affect the allocation of resources.
3.2 Evidence Concerning Changes in Liability Rules

Workplace safety regulation provides a good example of a change in liability rules, since in a relatively short period (1910–20) most US states replaced tort suits (the employer was only responsible for damages if negligent) with workers’ compensation (the employer is required to provide benefits under a no-fault rule) as the basic remedy for workplace injuries. Evidence from Chelius (1977) ‘clearly indicated that the death rate declined after workers’ compensation was instituted as the remedy for accident costs.’ This result suggests that the high transaction costs associated with the determination of fault in negligence suits were an obstacle to achieving the proper incentives for workplace safety, and that the institutional features of workers’ compensation, including the no-fault principle and experience rating, provided a relatively more efficient approach to the prevention of work injuries.

An interesting study that also suggests that institutional features can play a major role in determining the effects of changing liability rules is Fishback (1987). He found that fatality rates in coal mining were generally higher after states replaced negligence law with workers’ compensation. Fishback suggested that the difference between his general results and those of Chelius might be due to high supervision costs in coal mining in the early 1900s. The transaction costs component of the NIE theory thus appears to provide a useful supplement to neoclassical economics, since institutional features, such as liability rules, can have a major impact on the economic incentives for accident prevention.
4. Law and Economics

Law and economics (L&E) theory draws on neoclassical economics and transaction cost economics, but is distinctive in the extent to which it examines legal institutions and legal problems. This section examines the tort law branch of L&E theory, while the next section examines the employment law branch.
4.1 Theoretical Stimulus of Tort Law to Safety

Tort law typically is used when one party harms another party and the parties do not have an ongoing relationship. However, even though workers and employers have a continuing relationship, tort suits were generally used as a remedy for workplace accidents in the US until workers’ compensation programs were established. When negligence is the legal standard used for tort suits, if the employer has not taken proper measures to prevent accidents and thus is at fault, the employer will be liable for all of the consequences of the injury. The standard for the proper prevention measure was developed by Judge Learned Hand and restated by Posner (1972) as:

The judge (or jury) should attempt to measure three things in ascertaining negligence: the magnitude of the loss if an accident occurs; the probability of the accident’s occurring; and the burden (cost) of taking precautions to prevent it. If the product of the first two terms, the expected benefits, exceeds the burden of precautions, the failure to take those precautions is negligence.
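In the notation in which this test (the ‘Hand formula’) is conventionally written, let L be the magnitude of the loss if an accident occurs, P the probability of the accident, and B the burden of taking precautions. Failure to take precautions is then negligent when

\[ B \;<\; P \times L, \]

that is, when the precautions cost less than the expected loss they would avert. Read this way, the negligence standard asks the injurer to take exactly those precautions that pass a cost-benefit test.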
Posner argued that proper application of this standard would result in economically efficient incentives to avoid accidents. Burton and Chelius (1997) examine some qualifications to this conclusion.
4.2 Evidence on the Tort Law Stimulus to Safety

Two types of empirical evidence suggest skepticism is warranted about the stimulus to workplace safety from tort suits. First, tort suits were used as the remedy for workplace injuries in the late 1800s and early 1900s. As previously noted, Chelius (1977) found that the replacement of the negligence remedy with workers’ compensation led to a general reduction in workplace fatalities. The Fishback (1987) contrary result for a specific industry—coal mining—provides a qualification to this general result.

Second, in other areas of tort law, there is a major controversy among legal scholars about whether the theoretical incentives for safety resulting from tort suits actually work. One school of thought is exemplified by Landes and Posner (1987), who state that ‘although there has been little systematic study of the deterrent effect of tort law, what empirical evidence there is indicates that tort law … deters.’ An opposing view of the deterrent effects of tort law is provided by Priest (1991), who finds almost no relationship between actual liability payouts and the accident rate for general aviation and states that ‘this relationship between liability payouts and accidents appears typical of other areas of modern tort law as well, such as medical malpractice and product liability.’ A study of tort law by Schwartz (1994) distinguished a strong form of deterrence (as postulated by Landes and Posner) from a moderate form of deterrence, in which ‘tort law provides a significant amount of deterrence, yet considerably less than the economists’ formulae tend to predict.’ Schwartz surveyed a variety of areas where tort law is used, including motorist liability, medical malpractice, and product liability, and concluded that the evidence undermines the strong form of deterrence but provides adequate support for the moderate form. As to workers’ injuries, Schwartz cited the Chelius and Fishback studies and concluded ‘it is unclear whether a tort system or workers’ compensation provides better incentives for workplace safety.’

Burton and Chelius (1997) concluded, based on both the ambiguous historical experience of the impact of workers’ compensation on workplace safety and the current controversy over the deterrence effect in other areas of tort law, that ‘the law and economics theory concerning tort law does not provide much assistance in designing an optimal policy for workplace safety and health.’ Further examinations of the relative merits of workers’ compensation and the tort system in dealing with workplace accidents are Dewees et al. (1996) and Thomason et al. (1998).
5. Government Mandate Theory vs. Law and Economics Theory

5.1 The Government Mandate Theory

The government mandate theory argues that government promulgation of health and safety standards, and enforcement of the standards by inspections and fines, will improve workplace safety and health. This is basically a legal theory, although many of the supporting arguments involve reinterpretation or rejection of studies conducted by economists. For example, proponents of the theory object to the evidence on compensating wage differentials and on the deterrent effect of experience rating, and object in principle to economists’ reliance on cost-benefit analysis. Supporters of the theory, such as McGarity and Shapiro (1996), also provide a positive case why OSHA is necessary:

OSHA’s capacity to write safety and health regulations is not bounded by any individual worker’s limited financial resources. Likewise, OSHA’s capacity to stimulate an employer to action does not depend upon the employees’ knowledge of occupational risks or bargaining power.
The government mandate theory would not be endorsed by many economists, including the OIE. Commons and Andrews (1936), for example, criticized at length the punitive approaches that used factory inspectors in the role of policemen, since this turned employers into adversaries of the law. The minimum standards supported by the OIE were those developed by a tripartite commission, involving employers, employees, and the public, rather than standards promulgated by the government. While the OIE theory is thus unsympathetic to the government mandate theory, the sharpest attack is derived from the L&E theory.
5.2 The L&E Theory Concerning Government Regulations

L&E scholars make a distinction between mandatory, minimum terms (standards) and those terms that are merely default provisions (or guidelines) that employers and employees can agree to override. Most employment laws, including workplace safety laws, create standards and are thus objectionable to the L&E scholars. Willborn (1988) articulated the standard economic objections to mandatory terms. Employers will treat newly imposed standards like exogenous wage increases and in the short run will respond by laying off workers. In the long run, employers will try to respond to mandates by lowering the wage. The final wage-benefits-standards employment package will make workers worse off than they were before the imposition of standards—otherwise the employers and workers would have bargained for the package without legal compulsion.
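This logic can be made explicit with a stylized comparison; the notation is introduced here for illustration and is not Willborn’s. Let C be the employer’s long-run cost per worker of complying with a mandated standard, and V the dollar value workers place on the standard. If wages ultimately fall by roughly C, workers gain from the mandate only if

\[ V - C \;>\; 0. \]

But any package with V > C would already have been adopted voluntarily, since both sides could share the surplus. Hence, on this argument, a binding mandate matters only where V < C, which is precisely where it makes workers worse off.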
5.3 Evidence on the Effects of OSHA Standards

The evidence, as reviewed by Burton and Chelius (1997), suggests that the OSHAct has done little to improve workplace safety, thus lending more support to the L&E theory than to the government mandate theory. OSHA’s ineffectiveness in part may be due to the lack of inspection activity, since the average establishment is inspected only once every 84 years. But the evidence also suggests that allocating additional resources to plant inspections may be imprudent. Smith (1992) concluded, after an exhaustive review of the studies of OSHA inspections, that the evidence ‘suggests that inspections reduce injuries by 2 percent to 15 percent,’ although the estimates often are not statistically significant (and thus cannot be distinguished confidently from zero effect). Several studies, including Scholz and Gray (1990) and Weil (1996), provide a more favorable assessment of the OSHA inspection process. However, even Dorman (1996), who supports an aggressive public policy to reduce workplace injuries, provided a qualified interpretation of such evidence: ‘even the most optimistic reading indicates that … more vigorous enforcement alone cannot close the gap between U.S. safety conditions and those in other OECD countries.’

In addition to the questionable effectiveness of OSHA inspections, some of the standards promulgated by OSHA have been criticized as excessively stringent. Viscusi (1996) examined OSHA standards using an implicit value of life of $5 million (derived from the compensating wage differential studies) as the standard for an efficient regulation. Four of the five OSHA safety regulations, but only one of the five OSHA health regulations adopted as final rules, had costs per life saved of less than $5 million. This evidence caused Burton and Chelius (1997) to provide a strong critique of the government mandate theory:

To be sure, cost-benefit analysis of health standards issued under the OSHAct is not legal, and so those standards that fail the cost-benefit test (considering both the lives saved plus injuries and illnesses avoided) do not violate the letter and presumably the purpose of the law. But to the extent that the rationale offered by the government mandate theorists for regulation of health is that workers lack enough information to make correct decisions and therefore the government is in a better position to make decisions about how to improve workplace health, the evidence on the variability of the cost/benefit ratios for OSHA health standards is disquieting. Rather than OSHA standards reflecting interventions in the marketplace that overcome deficiencies of the marketplace, the explanation of why the stringency of regulation varies so much among industries would appear at best to be a result of technology-based decisions that could well aggravate the alleged misallocation of resources resulting from operation of the market and at worst could reflect the relative political power of the workers and employers in various industries.
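The efficiency screen Viscusi applies can be stated compactly. In stylized form (abstracting from the injuries and illnesses avoided, which the quotation above notes should also enter the benefit side), a rule passes the test when

\[ \frac{\text{annual compliance cost}}{\text{expected lives saved per year}} \;<\; \$5\ \text{million}, \]

where the $5 million threshold is the implicit value of life derived from the compensating wage differential literature. On this metric, a rule costing $50 million per year and expected to save 20 lives per year ($2.5 million per life) would pass, while one saving 5 lives per year ($10 million per life) would not; the dollar figures in this example are illustrative only.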
6. Conclusions

This review suggests that understanding the economics of workplace safety involves a rather eclectic mix of theories. Burton and Chelius (1997) were least impressed with the arguments and evidence pertaining to the pure neoclassical economics and the government mandate theories. They concluded that, among the other economic theories pertaining to safety, no single theory provides an adequate understanding of the causes and prevention of workplace accidents. Rather, a combination of the theories, though untidy, is needed.
Bibliography

Ackerman S 1988 Progressive law and economics—and the new administrative law. Yale Law Journal 98: 341–68
Aiuppa T, Trieschmann J 1998 Moral hazard in the French workers’ compensation system. Journal of Risk and Insurance 65: 125–33
Boden L 1995 Creating economic incentives: lessons from workers’ compensation systems. In: Voos P (ed.) Proceedings of the 47th Annual Meeting. Industrial Relations Research Association, Madison, WI
Bruce C J, Atkins F J 1993 Efficiency effects of premium-setting regimes under workers’ compensation: Canada and the United States. Journal of Labor Economics 11, Part 2: S38–S69
Burton J, Chelius J 1997 Workplace safety and health regulations: Rationale and results. In: Kaufman B (ed.) Government Regulation of the Employment Relationship, 1st edn. Industrial Relations Research Association, Madison, WI
Butler R 1994 Safety incentives in workers’ compensation. In: Burton J, Schmidle T (eds.) 1995 Workers’ Compensation Year Book. LRP Publications, Horsham, PA
Chelius J R 1977 Workplace Safety and Health. American Enterprise Institute for Public Policy Research, Washington, DC
Coase R H 1988 The Firm, the Market, and the Law. University of Chicago Press, Chicago
Commons J R 1934 Institutional Economics: Its Place in Political Economy. Macmillan, New York
Commons J R, Andrews J B 1936 Principles of Labor Legislation, 4th edn. Macmillan, New York
Dewees D, Duff D, Trebilcock M 1996 Exploring the Domain of Accident Law: Taking the Facts Seriously. Oxford University Press, Oxford
Dorman P 1996 Markets and Mortality: Economics, Dangerous Work, and the Value of Human Life. Cambridge University Press, Cambridge
Dorman P, Hagstrom P 1998 Wage compensation for dangerous work revisited. Industrial and Labor Relations Review 52: 116–35
Ehrenberg R G 1988 Workers’ compensation, wages, and the risk of injury. In: Burton J (ed.) New Perspectives on Workers’ Compensation. ILR Press, Ithaca, NY
Ehrenberg R G, Smith R S 1996 Modern Labor Economics: Theory and Public Policy, 6th edn. Addison-Wesley, Reading, MA
Fishback P V 1987 Liability rules and accident prevention in the workplace: Empirical evidence from the early twentieth century. Journal of Legal Studies 16: 305–28
Hirsch B T, Macpherson D A, Dumond J M 1997 Workers’ compensation recipiency in union and nonunion workplaces. Industrial and Labor Relations Review 50: 213–36
Kniesner T, Leeth J 1995 Abolishing OSHA. Regulation 4: 46–56
Landes W M, Posner R A 1987 The Economic Structure of Tort Law. Harvard University Press, Cambridge, MA
McGarity T, Shapiro S 1996 OSHA’s critics and regulatory reform. Wake Forest Law Review 31: 587–646
Miller P, Mulvey C, Norris K 1997 Compensating differentials for risk of death in Australia. Economic Record 73: 363–72
Moore M J, Viscusi W K 1990 Compensation Mechanisms for Job Risks: Wages, Workers’ Compensation, and Product Liability. Princeton University Press, Princeton, NJ
Oi W Y 1974 On the economics of industrial safety. Law and Contemporary Problems 38: 669–99
Posner R 1972 A theory of negligence. Journal of Legal Studies 1: 29–66
Priest G L 1991 The modern expansion of tort liability: its sources, its effects, and its reform. Journal of Economic Perspectives 5: 31–50
Reilly B, Paci P, Holl P 1995 Unions, safety committees and workplace injuries. British Journal of Industrial Relations 33: 275–88
Schwartz G T 1994 Reality in the economic analysis of tort law: does tort law really deter? UCLA Law Review 42: 377–444
Scholz J T, Gray W B 1990 OSHA enforcement and workplace injuries: a behavioral approach to risk assessment. Journal of Risk and Uncertainty 3: 283–305
Siebert S, Wei X 1994 Compensating wage differentials for workplace accidents: evidence for union and nonunion workers in the UK. Journal of Risk and Uncertainty 9: 61–76
Smith R 1992 Have OSHA and workers’ compensation made the workplace safer? In: Lewin D, Mitchell O, Sherer P (eds.) Research Frontiers in Industrial Relations and Human Resources. Industrial Relations Research Association, Madison, WI
Thomason T, Burton J F 1993 Economic effects of workers’ compensation in the United States: Private insurance and the administration of compensation claims. Journal of Labor Economics 11, Part 2: S1–S37
Thomason T, Hyatt D, Roberts K 1998 Disputes and dispute resolution. In: Thomason T, Burton J, Hyatt D (eds.) New Approaches to Disability in the Workplace. Industrial Relations Research Association, Madison, WI
Viscusi W K 1993 The value of risks to life and health. Journal of Economic Literature 31: 1912–46
Viscusi W K 1996 Economic foundations of the current regulatory reform efforts. Journal of Economic Perspectives 10: 119–34
Weil D 1996 If OSHA is so bad, why is compliance so good? Rand Journal of Economics 27: 618–40
Weil D 1999 Are mandated health and safety committees substitutes for or supplements to labor unions? Industrial and Labor Relations Review 52: 339–60
Willborn S 1988 Individual employment rights and the standard economic objections: theory and empiricism. Nebraska Law Review 67: 101–39
J. F. Burton, Jr.
Sample Surveys: Cognitive Aspects of Survey Design

Cognitive aspects of survey design refers to a research orientation that began in the early 1980s to integrate methods, techniques, and insights from the cognitive sciences into the continuing effort to render data from sample surveys of human populations more valid and reliable, and to understand systematically the threats to such reliability and validity.
These concerns led to a cross-disciplinary research endeavor that would bring the insights and methods of the cognitive sciences, especially cognitive psychology, to bear on the problems raised by surveys, and that would encourage researchers in the cognitive sciences to use surveys as a means of testing, broadening, and generalizing their laboratory-based conclusions. Recent work in the field has incorporated theories and viewpoints from fields not usually classified as the cognitive sciences, notably linguistics, conversational analysis, anthropology, and ethnography. Described below are some of the practical and theoretical achievements of the movement to date. These include methodological transfers from the cognitive sciences to survey research, the broad establishment of cognitive laboratories in governmental and nongovernmental survey research centers, theoretical formulations that account for many of the mysteries generated by survey responses, and practical applications to the development and administration of on-going surveys. Some references appear at the close describing these achievements more fully and speculating on the future course of the field.
1. Background and History

While a sample survey of human beings can be of good and measurable accuracy only if it is accomplished through probability sampling (Sample Surveys: The Field and Sample Surveys: Methods), probability sampling is only the first step in assuring that a survey is successful in producing results accurate enough for use in predicting elections, for gauging public opinion, for measuring the incidence of a disease or the amount of acreage planted in a crop, or for any of the other myriad important uses that depend on survey data. Questions must be worded not only in ways that are unbiased, but that are understandable to respondents and convey the same meaning to respondents as was intended by the survey’s author. If the questions refer to the past, they must be presented in ways that help respondents remember the facts accurately and report truthfully. And interviewers must be able to understand respondents’ answers correctly to record and categorize them appropriately.

These and other nonsampling issues have been of concern to survey researchers for many decades (e.g., Payne 1951). Researchers were puzzled by the fact that changing the wording of a question slightly sometimes resulted in a different distribution of answers and sometimes did not; worried that sometimes the context of other questions affected answers to a particular question, and sometimes did not; concerned that sometimes respondents were able to remember things accurately and sometimes were not. In the late 1970s and early 1980s, as survey data came to be used ever more extensively for public policy purposes, concerns for the validity of those data became especially widespread.
2. Methodological Transfers

Much work in the field is rooted in a classic information processing model applied to the cognitive tasks in a survey interview. The model suggests that the tasks the respondent must accomplish, in the rough order in which they must be tackled (though respondents go back and forth among them), are comprehension/interpretation, retrieval, judgment, and communication. These models have been intertwined with the earlier models of the survey interview as a social interaction (Sudman and Bradburn 1974) through the efforts of such researchers as Clark and Schober (1992) and Sudman et al. (1996), to take into account both the social communication and individual cognitive processes that play a part in the survey interview.

Guided by this model, one of the earliest achievements of the movement to study cognitive aspects of survey design was the adoption of some of the methods of the cognitive psychology laboratory for the pretesting of survey questionnaires. In particular, these methods are aimed at ensuring that questions are comprehensible to respondents and that the meanings transmitted are those intended by the investigator. They also can point to problems in retrieval and communication. Such cognitive pretesting includes ‘think-aloud protocols,’ in which specially recruited respondents answer the proposed questionnaire aloud to a researcher and describe, either concurrently with each question or retrospectively at the end of the questionnaire, their thought processes as they attempt to understand the question and to retrieve or construct an answer. Also used is behavioral coding, based on procedures originally developed by Cannell and coworkers (e.g., Cannell and Oksenberg 1988), which notes questions for which respondents ask for clarification or for which their responses are inadequate. Other methods include reviews of questionnaires by cognitively trained experts, the use of focus groups, and, for questionnaires that are automated, measures of response latency. Detailed descriptions of these methods and their applicability to detecting problems at various stages of the question answering process appear in Sudman et al. (1996), Schwarz and Sudman (1996), and Forsyth and Lessler (1991).
3. Interiewer Behaior Although survey interviewing began with investigators themselves merely going out and having conversations with respondents, as the survey enterprise aspired to greater scientific respectability and as it became sufficiently large-scale so that many interviewers were needed for any project, standardized interviewing became the norm. Printed questionnaires were to be followed to the letter; interviewers were instructed to answer respondents’ requests for clarification by merely re-reading the question or by saying that the meaning of any term is ‘whatever it means to you.’ While it is probably true that skilled interviewers have always deviated from these rigid standards in the interest of maintaining respondent cooperation and getting valid data, the robot-like interviewer delivering the same ‘stimulus’ to every respondent was considered the ideal. Working with videotapes of interviews for the National Health Interview Survey and for the General Social Survey and using techniques of conversational analysis, Suchman and Jordan (1990) showed that misunderstandings frequently arose because of the exclusion of normal conversational resources from the survey interview, and that these misunderstandings resulted not only in discomfort on the part of respondents, but in data that did not meet the needs of the survey designer. Suchman and Jordan recommended a more collaborative interviewing style, in which interviewer and respondent work together to elicit the respondent’s story and then fill in the questionnaire. Note that this recommendation applies almost exclusively when the information sought is about the respondent’s activities or experiences rather than about his\her attitudes or opinions. Both theoretical and practical advances have followed on these insights. On a theoretical level, several researchers have used variants of the ‘cooperativeness’ principle advanced by Grice (1975) to help explain some of the anomalies that have puzzled survey researchers over the decades. Many are understandable as respondents bringing to the survey interview the communicative assumptions and skills that serve well in everyday life. The maxims of the cooperativeness principle require a participant
in a conversation to be truthful, relevant, informative, and clear (Sudman et al. 1996, p. 63). Thus, for example, it has long been known that respondents are willing to report on their attitudes towards fictitious issues or nonexistent groups, and investigators have taken this willingness as evidence of the gullibility of respondents and perhaps of the futility of trying to measure attitudes at all. But to 'catch on' to the 'trick' nature of such a question, a respondent would have to assume that the interviewer had violated all the maxims of the cooperativeness principle. Parsimony suggests, instead, that the respondent would assume that the interviewer knew what s/he was talking about, and answer accordingly. Several puzzles about context effects, discussed below, also seem understandable in light of Grice's maxims. On a practical level, freeing interviewers to be more conversational with respondents is not costless; when interviewers explain more, interviews take longer, and are thus more costly in interviewers' wages and respondents' patience. Schober and Conrad (e.g., 1997) have embarked on a program of research demonstrating that conversational interviewing generates better data than does standardized interviewing when the mappings of the respondent's experiences onto the survey's concepts are complicated, but at the cost of longer interviews. These investigators are now experimenting with interviewing styles that fall at several points along the continuum between fully standardized and fully conversational, seeking a point that balances the benefits in accuracy to be gained by a more conversational style against the costs in interview length.
4. Factual Questions vs. Questions about Attitudes or Opinions

The distinction between questions that ask respondents to report on facts, usually autobiographical, and those that ask for an attitude or an opinion has been an important one in survey research. Indeed, the explicit origins of the movement to study cognitive aspects of survey design lie in a desire to improve the accuracy of factual reports from respondents; the extension of concern to attitudinal questions came somewhat later, although such a concern was certainly prefigured by the work reported in Turner and Martin (1984). The distinction remains important for interviewer behavior, in the sense that the kinds of conversational interviewing being investigated are designed to help respondents better comprehend the intent of the question and to recall their experiences more accurately. Interviewer interventions designed for these purposes when a question aims at a factual report might well bias the respondent's answer when the aim of the question is to elicit an attitude or opinion. The distinction is also valid in the sense that the cognitive theories about
autobiographical memory that have been used to understand problems of the recall of factual information are less applicable to attitudes and opinions, and the cognitive techniques that have been developed to aid respondent recall for autobiographical questions are not appropriate for the recall of attitudes or opinions. But in important ways the distinction obscures similarities between the two types of questions, as will be discussed in the section on context effects, below.
4.1 Improving the Accuracy of Factual Reports

Very often survey questions ask for the frequency with which a respondent has performed a behavior (e.g., visited a doctor) or experienced an event (e.g., had an illness) during a specified time period (usually starting sometime in the past and ending with the date of the interview) called the reference period. Thus the respondent, having understood what events or experiences should be included in the report, has a two-fold task: he or she must recall the events or experiences, and also determine whether those events or experiences fell inside the reference period. Theories from the cognitive sciences about how experiences are stored in autobiographical memory and retrieved therefrom have been marshaled to understand both phases of the respondent's recall task and to improve their accuracy. In a phenomenon called 'telescoping,' respondents are known often to move events or experiences forward in time, reporting them as falling within the reference period although in fact they occurred earlier. One of the earliest contributions to the literature on cognitive aspects of survey design was the proposal of a technique dubbed 'landmarking' to control telescoping (Loftus and Marburger 1983). Here, rather than inquiring about the last six months or the last year, a respondent is supplied with (or is asked to supply) a memorable date, and then is asked about events or experiences since that date. The related concept of bounding is also useful in controlling telescoping. In a panel survey, in which respondents are interviewed repeatedly at regular intervals, informing them of the events or experiences reported at the last interview will prevent such events or experiences from being placed in the time period between the previous interview and the current one. An analogous technique usable in a single interview is the two-time-frame procedure, in which the respondent is first asked to report events or experiences during a long reference period (e.g., the last 6 months) and then during a shorter one that is really of interest to the investigator (e.g., the last 2 months). This technique seems both to relieve the pressure on respondents to report socially desirable events or experiences and to stress the interviewer's interest in correct dating, encouraging respondents to strive for greater accuracy.
A theoretical explanation of telescoping, taking into account the workings of memory for the storage of timing and the effects of rounding to culturally stereotypical values (e.g., 7 days, 10 days, 30 days), is provided by Huttenlocher et al. (1990). The task of counting or estimating the number of events or experiences in the reference period can also provide a challenge to the respondent. Issues of frequency, regularity, and similarity influence whether the respondent is likely to actually count the instances or to employ an estimation strategy, as well as whether counting or estimation is likely to be more accurate. In general, respondents are more likely to retrieve from memory and count events or experiences if their number is few and to estimate if their number is large. But regular events (e.g., going to the dentist every 6 months) and similar events (e.g., trips to the grocery store) are more likely to be estimated. And investigators have recently found that estimation (perhaps by retrieving a rate and applying that rate to the length of the reference period) is usually more accurate than the attempt to retrieve and count regular, similar events or experiences. For a full discussion of these issues, see Sudman et al. (1996).
4.2 Context Effects

Context effects occur when the content of the questions preceding a particular question (or even following that question in a self-administered questionnaire in which the respondent is free to return to earlier questions after reading later ones) influences how some respondents answer that question. Our understanding of the term has recently been broadened to include also the effects of the response categories offered in a closed-ended question on the distribution of responses. Investigators have used the maxims of Grice's cooperativeness principle to approach a systematic understanding of these effects, which had seemed mysterious for many years. Full treatments of these issues may be found in Sudman et al. (1996) and Tourangeau (1999); of necessity only some selected findings can be presented here. A pair of puzzling context effects that have fascinated researchers for years are assimilation effects and contrast effects. An assimilation effect occurs when the content of the preceding items moves the responses to the target item in the direction of the preceding items; a contrast effect occurs when the movement is in the opposite direction. These effects can now be understood through the inclusion/exclusion model of the judgment process in survey responding, proposed by Schwarz and Bless (1992). The model holds that when a respondent has to make a judgment about a target stimulus, s/he must construct a cognitive representation, not only of the target stimulus, but also of a standard with which to compare the target
stimulus. To construct these representations respondents must retrieve information from memory; since not all information in memory can be retrieved, respondents retrieve only that which is most easily accessible. And information can be accessible either because it is 'chronically' accessible or because it is temporarily accessible; it is the temporary accessibility of information that is affected by the context of the question. Positive temporarily accessible information supplied by the context and included in the representation of the target will render the judgment of the target more positive; negative context information thus included will render the judgment more negative. Both these processes produce assimilation effects. Contrast effects are created when the context suggests that information be excluded from the representation of the target, or included in the representation of the standard with which the target is to be compared. For example, Schwarz and Bless (1992) were able to manipulate the frequency of positive evaluations of a political party by using a reference to a popular politician in a preceding question. They were able to create an assimilation effect, resulting in more positive evaluations, by encouraging respondents to include the politician in their representations of the party (by having them recall that the politician had been a member of the party). They were also able to create a contrast effect, resulting in less positive evaluations of the political party, by encouraging respondents to exclude the politician from their representation of the party (by having them recall that his present position took him out of party politics). The model predicts inclusion of more information when the target is a general question, thus encouraging assimilation effects; it also predicts contrast effects when the target is narrow. Thus, a preceding question about marital happiness seems to result in higher reports of general happiness for those who are happily married, but lower reports of general happiness for those who are unhappy in their marriages. This influence of the preceding question can be eliminated by having respondents see the questions about marital happiness and general happiness as part of a single unit (either by an introduction that links the two questions or by a typographical link on a printed questionnaire); then, in obedience to the conversational maxim that speakers should present new information, respondents exclude their marital happiness from their evaluation of their general happiness. Similar recourse to conversational maxims helps explain the influence of the response alternatives presented in closed answer questions on the distribution of responses. For example, Schwarz et al. (1985) found that 37.5 percent of respondents reported watching TV for more than two and a half hours a day when the response scale ranged from 'up to 2½ hours' to 'more than 4½ hours,' but only 16.2 percent reported watching more than two and a half hours a day when the response scale ranged from 'up to ½ hour' to 'more
than 2½ hours.' Respondents take the response scales to be conversationally informative about what typical TV watching habits are and calibrate their report accordingly.
5. Some Achievements of the Movement to Study Cognitive Aspects of Surveys

5.1 The Impact of Theories

Perhaps the most profound effect of the movement to study cognitive aspects of survey design is the introduction of theory into the research area. There has been space above to present only a few of the many such theoretical advances, but their utility should be clear. For example, consulting the maxims of cooperativeness in conversation gives us a framework for predicting and preventing response effects. In the same sense, the inclusion/exclusion model, much more fully developed than is presented here, makes predictions about what to expect from various manipulations of question ordering and content. These predictions offer guidance for the experimental advance of the research domain, and their testing systematically advances our knowledge, both of the phenomena and of the applicability of the theory.
5.2 The Cognitive Laboratories and Their Impact

Cognitive interviews in laboratory settings have become standard practice in several US government agencies (including the National Center for Health Statistics, the Bureau of Labor Statistics, and the Census Bureau), as well as in government agencies around the world and at many academic and commercial survey organizations. Work in these laboratories supplements the usual large-scale field pretests of surveys. For illustrative purposes it is worth describing two major efforts at US government agencies. The Current Population Survey (CPS) is a large-scale government survey, sponsored by the Bureau of Labor Statistics and carried out by the Census Bureau. From interviews with some 50,000 households monthly are derived major government statistical series, including the unemployment rate. In the early 1990s the CPS underwent its decennial revision, this time using the facilities of the government cognitive laboratories as well as traditional field testing of proposed innovations. In particular, this research found that the ordering and wording of the key questions regarding labor force participation caused some respondents to report themselves as out of the labor force when in fact they were either working part time or looking for work. It also found that the concept of layoff, which the CPS takes to mean
temporary furlough from work with a specific date set for return, was not interpreted that way by respondents. Respondents seemed to understand 'on layoff' as a polite term for 'fired.' Appropriate revision of these questions, followed by careful field testing and running the new and old versions in parallel for some months, resulted in estimates of the unemployment rate in which policy makers and administrators can have greater confidence. More details can be found in Norwood and Tanur (1994). The US Decennial Census has always asked for a racial characterization of each resident. Originally designed to distinguish whites (mostly free) from blacks (mostly slaves) for purposes of counting for apportionment (the US Constitution originally mandated that a slave be counted as 3/5 of a man), the uses to which answers to the race identification question were put evolved over time. Civil Rights legislation starting in the 1960s designated protected groups, and it thus became important to know the population of such groups. At the same time, individuals were becoming both more conscious of their racial self-identification and less willing to be pigeonholed into a small set of discrete categories, especially as more individuals were self-identifying with more than one racial group. The 1990 Census asked two questions in this general area: one on racial identity followed by one on Hispanic ethnic identification. Experimental work (Martin et al. 1990) indicated that the ordering of these questions mattered to respondents. For example, when the Hispanic ethnicity question appeared first, fewer people chose 'other' as their racial category. This and other research made a convincing case that the complexity of Americans' racial and ethnic self-identification needed a more complex set of choices on the Census form. A 4-year program of research resulted in the conclusion that allowing respondents to choose as many racial categories as they believe apply to them is a solution superior to the provision of a multiracial category (see Tucker et al. 1996). (For more details on this and other census research see Censuses: History and Methods.) The 2000 Decennial Census allowed respondents to choose as many racial categories as they wished; at this writing it is not yet known what proportion of the population took advantage of that opportunity, or the effects those choices will have on the analyses of the data.
6. Conclusions

The movement to study cognitive aspects of surveys has thus given us some theoretical insights and practical approaches. Ideally, we could hope for a theory of the survey designing and responding process, but we are still far from that goal, and may perhaps never be able to attain it. But the frameworks so far provided offer clear guidance on how to proceed in an effort to reach systematic understanding. Far fuller accounts of the results of the movement to study
cognitive aspects of survey design and ideas of future directions can be found in Sirken et al. (1999a), Sirken et al. (1999b), Sudman et al. (1996), and Tanur (1992). See also: Probability: Formal; Probability: Interpretations; Sample Surveys, History of; Sample Surveys: Methods; Sample Surveys: Model-based Approaches; Sample Surveys: Nonprobability Sampling; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field
Bibliography

Cannell C F, Oksenberg L 1988 Observation of behavior in telephone interviews. In: Groves R M, Biemer P B, Lyberg L E, Massey J T, Nichols II W L, Waksberg J (eds.) Telephone Survey Methodology. Wiley, New York
Clark H H, Schober M F 1992 Asking questions and influencing answers. In: Tanur J (ed.) Questions about Questions: Inquiries into the Cognitive Bases of Surveys. Sage, New York
Forsyth B H, Lessler J 1991 Cognitive laboratory methods: A taxonomy. In: Biemer P, Groves R M, Lyberg L E, Mathiowetz N, Sudman S (eds.) Measurement Errors in Surveys. Wiley, New York
Grice H P 1975 Logic and conversation. In: Cole P, Morgan J L (eds.) Syntax and Semantics, Vol. 3: Speech Acts. Academic Press, New York
Hippler H J, Schwarz N, Sudman S (eds.) 1987 Social Information Processing and Survey Methodology. Springer-Verlag, New York
Huttenlocher J, Hedges L V, Bradburn N M 1990 Reports of elapsed time: Bounding and rounding processes in estimation. Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 196–213
Loftus E F, Marburger W 1983 Since the eruption of Mt. St. Helens, did anyone beat you up? Improving the accuracy of retrospective reports with landmark events. Memory and Cognition 11: 114–20
Martin E, DeMaio T J, Campanelli P C 1990 Context effects for census measures of race and Hispanic origin. Public Opinion Quarterly 54: 551–66
Norwood J L, Tanur J M 1994 Measuring unemployment in the nineties. Public Opinion Quarterly 58: 277–94
Payne S L 1951 The Art of Asking Questions. Princeton University Press, Princeton, NJ
Schober M F, Conrad F G 1997 Does conversational interviewing reduce survey measurement error? Public Opinion Quarterly 61: 576–602
Schwarz N, Bless H 1992 Constructing reality and its alternatives: Assimilation and contrast effects in social judgment. In: Martin L L, Tesser A (eds.) The Construction of Social Judgment. Erlbaum, Hillsdale, NJ
Schwarz N, Hippler H J, Deutsch B, Strack F 1985 Response categories: Effects on behavioral reports and comparative judgments. Public Opinion Quarterly 49: 388–95
Schwarz N, Sudman S (eds.) 1996 Answering Questions: Methodology for Determining Cognitive and Communicative Processes in Survey Research. Jossey-Bass, San Francisco
Sinaiko H, Broedling L A (eds.) 1976 Perspectives on Attitude Assessment: Surveys and their Alternatives. Pendleton, Champaign, IL
Sirken M G, Herrmann D J, Schechter S, Schwarz N, Tanur J M, Tourangeau R (eds.) 1999a Cognition and Survey Research. Wiley, New York
Sirken M G, Jabine T, Willis G, Martin E, Tucker C (eds.) 1999b A New Agenda for Interdisciplinary Survey Research Methods: Proceedings of the CASM II Seminar. US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, Hyattsville, MD
Suchman L, Jordan B 1990 Interactional troubles in face-to-face survey interviews. Journal of the American Statistical Association 85: 232–53
Sudman S, Bradburn N M 1974 Response Effects in Surveys: A Review and Synthesis. Aldine, Chicago
Sudman S, Bradburn N M, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Tanur J M (ed.) 1992 Questions about Questions: Inquiries into the Cognitive Bases of Surveys. Sage, New York
Tourangeau R 1999 Context effects on answers to attitude questions. In: Sirken M G, Herrmann D J, Schwarz N, Tanur J M, Tourangeau R (eds.) Cognition and Survey Research. Wiley, New York
Tucker C, McKay R, Kojetin B, Harrison R, de la Puente M, Stinson L, Robison E 1996 Testing methods of collecting racial and ethnic information: Results of the Current Population Survey Supplement on Race and Ethnicity. Bureau of Labor Statistics, Statistical Notes, No. 40
Turner C F, Martin E (eds.) 1984 Surveying Subjective Phenomena. Sage, New York
J. M. Tanur
Sample Surveys, History of

Sampling methods have their roots in attempts to measure the characteristics of a nation's population by using only a part instead of the whole population. The movement from the exclusive use of censuses to the at least occasional use of samples was slow and laborious. Two intertwined intellectual puzzles had to be solved before the move was complete. The first puzzle was whether it is possible to derive valid information about a population by examining only a portion thereof (the 'representative' method). The second concerned the method for choosing that portion. Issues of choice themselves fall into two categories: the structuring of the population itself in order to improve the accuracy of the sample, and whether to select the units for inclusion by purposive or random methods. This article considers the development of methods for the sampling of human populations. We begin by describing two antecedents of modern survey methods: the development of government statistics and censuses, and the general development of statistical methodology, especially for ratio estimation. Then we deal with the movement from censuses to sample surveys, examining the debate over representative methods and the rise of probability sampling, including the seminal 1934 paper of Jerzy Neyman. We describe the impact in the United States of the results
obtained by Neyman on sample surveys and the contributions of statisticians working in US government statistical agencies. We end with a brief description of the subsequent establishment of random sampling and survey methodology more broadly and examine some history of the movement from the study of the design of sampling schemes to the study of other issues arising in the surveying of human populations.
1. Antecedents of Modern Sample Survey Methods

In order to determine facts about their subjects or citizens, and more often about their number, governments have long conducted censuses. Thus, one set of roots of sample survey methodology is intertwined with the history of methods for census-taking. The origins of the modern census are found in the biblical censuses described in the Old Testament (e.g., see the discussions in Duncan 1984 and Madansky 1986), as well as in censuses carried out by the ancient Egyptians, Greeks, Japanese, Persians, and Romans (Taeuber 1978). For most practical purposes we can skip from biblical times to the end of the eighteenth century and the initiation of census activities in the United States of America, even though there is some debate as to whether Canada, Sweden, or the United States should be credited with originating the modern census (Willcox 1930) (see Censuses: History and Methods). Another antecedent of sampling lies in the problem of estimating the size of a population when it is difficult or impossible to conduct a census or complete enumeration. Attempts to solve this problem even in the absence of formal sampling methods were the inspiration of what we now know as ratio estimation. These ideas emerged as early as the seventeenth century. For example, John Graunt used the technique to estimate the population of England from the number of births and some assumptions about the birth rate and the size of families. Graunt's research, later dubbed political arithmetic, was based on administrative records (especially parish records) and personal observation, as in his seminal work, Natural and Political Observations Made Upon the Bills of Mortality, published in 1662. To accomplish what Graunt did by assumption, others looked to hard data from subsets of a population. Thus, Sir Frederick Morton Eden, for example, found the average number of people per house in selected districts of Great Britain in 1800. Using the total number of households in Great Britain from tax rolls (and allowing for those houses missing from the rolls), Eden estimated the population of Great Britain as nine million, a figure confirmed by the first British census of 1801 (Stephan 1948). Even earlier, in 1765 and 1778, civil servants published population estimates for France using an
enumeration of the population in selected districts and counts of births, deaths, and marriages for the country as a whole (see Stephan 1948). It was Pierre Simon de Laplace, however, who first formally described the method of ratio estimation during the 1780s, and he then employed it in connection with a sample survey he initiated to estimate the population of France as of September 22, 1802. He arranged for the government to take a sample of administrative units (communes), in which the total population, y, and the number of registered births in the preceding year, x, were measured. Laplace then estimated the total population of France by Ŷ = Xy/x, where X was the total number of registered births. Laplace was the first to estimate the asymptotic bias and variance of Ŷ, through the use of a superpopulation model for the ratio p = y/x (see Laplace 1814). Cochran (1978) describes this derivation and links it to results which followed 150 years later. Quetelet later applied the ratio estimation approach in Belgium in the 1820s but abandoned it following an attack by others on some of the underlying assumptions (see Stigler 1986, pp. 162–6).
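In modern terms Laplace's procedure is a ratio estimate, and it is simple to compute. The sketch below uses hypothetical commune figures (not Laplace's actual data) purely to illustrate the arithmetic:

```python
# Laplace-style ratio estimate of a national population.
# Each sampled commune contributes (total population y_i, registered births x_i);
# the figures below are hypothetical, for illustration only.
sample = [(1250, 42), (980, 31), (2210, 75), (640, 20)]

y = sum(pop for pop, births in sample)      # sampled population
x = sum(births for pop, births in sample)   # sampled births

X = 1_000_000  # known national total of registered births (hypothetical)

Y_hat = X * y / x  # ratio estimate of the national population
print(f"Estimated population: {Y_hat:,.0f}")
```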
2. From Censuses to Surveys

The move from censuses to sample surveys to measure the characteristics of a nation's population was slow and laborious. Bellhouse (1988), Kruskal and Mosteller (1980), and Seng (1951) trace some of this movement, especially as it was reflected in the discussions regarding surveys that took place at the congresses of the International Statistical Institute (ISI). The move succeeded when investigators combined random selection methods with the structuring of the population and developed a theory for relating estimates from samples, however structured, to the population itself. As early as the 1895 ISI meeting, Kiaer (1896) argued for a 'representative method' or 'partial investigation,' in which the investigator would first choose districts, cities, etc., and then units (individuals) within those primary choices. The choosing at each level was to be done purposively, with an eye to the inclusion of all types of units. That coverage tenet, together with the large sample sizes recommended at all levels of sampling, was what was judged to make the selection representative. Thus the sample was, approximately, a 'miniature' of the population. Kiaer used systematic sampling in his 1895 survey of workers as a means of facilitating special tabulations from census schedules for a study of family data in the 1900 Norwegian census. Kiaer actually introduced the notion of random selection in his description of this work by noting that at the lowest level of the sample structure 'the sample should be selected in a haphazard and random way, so that a sample selected in this manner would turn out in the same way as would have been the case
had the sample been selected through the drawing of lots …' (Kiaer 1897, p. 39). The idea of less than a complete enumeration was widely opposed, and Kiaer presented arguments for sampling at ISI meetings in 1897, 1901, and 1903. Lucien March, in a discussion of Kiaer's paper at the 1903 meeting, formally introduced the concepts of simple random sampling without replacement and simple cluster sampling, although not using these names (Bellhouse 1988). Arthur Lyon Bowley did a number of empirical studies on the validity of random sampling, motivated at least in part by a paper by Francis Ysidro Edgeworth (1912). Bowley used large sample normal approximations (but did not conceptualize what he was doing as sampling from a finite population) to actually test the notion of representativeness. Then, he carried out a 1912 sample survey on poverty in Reading, in which he drew the respondents at random (see Bowley 1913), although he appears to have equated simple random sampling with systematic sampling (Bellhouse 1988). At about the same time, in what appears to be an independent line of discovery, Tchouproff and others were overseeing the implementation of sample survey methods in Russia, especially during the First World War and the period immediately following it. This work was partially documented after the Russian Revolution in Vestnik Statistiki, the official publication of the Central Statistical Administration (Zarkovic 1956, 1962). Even though there is clear evidence that Tchouproff had developed a good deal of the statistical theory for random sampling from finite populations during this period (see Seneta 1985), whether he actually implemented forms of random selection in these early surveys is unclear. He later published the formulae for the behavior of sample estimates under simple random sampling and stratified random sampling from finite populations in Tchouproff (1918a, 1918b, 1923a, 1923b). In a seemingly independent development of these basic ideas for sampling in the context of his work on agricultural experiments, Neyman in Splawa-Neyman (1923) described them in the form of the drawing of balls without replacement from an urn. In the resulting 1925 paper, Splawa-Neyman (1925) gave the basic elements of the theory for sampling from finite populations and its relationship with sampling from infinite populations. These results clearly overlapped those of Tchouproff, and two years later, following Tchouproff's death, his partisans took Neyman to task for his lack of citation of the Russian work as well as of earlier work by others (see Fienberg and Tanur 1995, 1996, for a discussion of the controversy). By 1925, the record of the ISI suggests that the representative method was taken for granted, and the discussions centered around how to accomplish representativeness and how to measure the precision of sample-based estimates, with the key presentations
being made by Bowley (1926) and in the same year by the Danish statistician Adolph Jensen. Notions of clustering and stratification were put forward, and Bowley presented a theory of proportionate stratified sampling as well as the concept of a frame, but purposive sampling was still the method of choice. It was not until Gini and Galvani made a purposive choice of which returns of an Italian census to preserve, and found that districts chosen to represent the country's average on seven variables were, in that sense, unrepresentative on other variables, that purposive sampling was definitively discredited (Gini 1928, Gini and Galvani 1929).
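For reference, the finite-population results developed in this period take a familiar modern form. For simple random sampling without replacement of n units from a population of N units with mean Ȳ, the sample mean ȳ satisfies

$$
E(\bar{y}) = \bar{Y}, \qquad
\operatorname{Var}(\bar{y}) = \Bigl(1 - \frac{n}{N}\Bigr)\frac{S^2}{n}, \qquad
S^2 = \frac{1}{N-1}\sum_{i=1}^{N}\bigl(y_i - \bar{Y}\bigr)^2,
$$

where the factor (1 − n/N) is the finite population correction.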
3. Neyman’s 1934 Paper on Sampling In their work on the Italian census, Gini and Galvani seemed to call into question the accuracy of sampling. Neyman took up their challenge in his classic 1934 paper presented before the Royal Statistical Society, ‘On the two different aspects of the representative method.’ In it he compared purposive and random sampling, and concluded that it wasn’t sampling that was problematic but rather Gini and Galvani’s purposive selection. Elements of synthesis were prominent in the paper as well. Neyman explicitly uncoupled clustering and purposive sampling, saying, ‘In fact the circumstance that the elements of sampling are not human individuals, but groups of these individuals, does not necessarily involve a negation of the randomness of the sampling’ (1952, p. 571). He calls this procedure ‘random sampling by groups’ and points out that, although Bowley did not consider it theoretically, he used it in practice in London, as did O. Anderson in Bulgaria. Neyman also combined stratification with clustering to form ‘random stratified sampling by groups,’ and he provided a method for deciding how best to allocate samples across strata (optimal allocation). The immediate effect of Neyman’s paper was to establish the primacy of the method of stratified random sampling over the method of purposive selection, something that was left in doubt by the 1925 ISI presentations by Jensen and Bowley. But the paper’s longer-term importance for sampling was the consequence of Neyman’s wisdom in rescuing clustering from the clutches of those who were the advocates of purposive sampling and integrating it with stratification in a synthesis that laid the groundwork for modern-day multistage probability sampling. Surprisingly, for many statisticians the memorable parts of Neyman’s paper were not these innovations in sampling methodology but Neyman’s introduction of general statistical theory for point and interval estimation, especially the method of confidence intervals (see Estimation: Point and Interal and Frequentist Inference).
As pathbreaking as Neyman's paper was, a number of its results had appeared in an earlier work by Tschuprow (1923a, 1923b), in particular the result on optimal allocation in stratified sampling (see Fienberg and Tanur 1995, 1996, for discussions of this point). The method was derived much earlier still by the Danish mathematician Gram (1883) in a paper dealing with calculations for the cover of a forest based on a sample of trees. Gram's work has only recently been rediscovered and seems not to have been accessible to either Tchouproff or Neyman. Neyman provided the recipe for others to follow, and he continued to explain its use in convincing detail to those who were eager to make random sampling a standard diet for practical consumption (e.g., see Neyman 1952 for a description based on his 1937 lectures on the topic at the US Department of Agriculture Graduate School, as discussed below).
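In modern notation, the optimal (Neyman) allocation of a total sample of size n across strata h = 1, …, H, ignoring differences in per-unit cost, assigns

$$
n_h = n\,\frac{N_h S_h}{\sum_{k=1}^{H} N_k S_k},
$$

where N_h is the number of population units in stratum h and S_h is the standard deviation of the study variable within that stratum; strata that are larger or more variable receive proportionally more of the sample.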
4. The Development of Random Sampling in the United States

One might have thought that the resolution of the controversy over the representative method and the articulation of the basic elements of a theory of sample surveys by Neyman would have triggered extensive application of random sampling throughout the world. Surprisingly, this was not the case. With some notable exceptions, for example, in England (see Cochran 1939 and Yates 1946) and India, the primary application occurred in the United States, and this led to a spate of new and important methodological developments. As late as 1932, however, there were few examples of probability sampling anywhere in the US federal government (Duncan and Shelton 1978), and the federal statistical agencies had difficulty responding to the demand for statistics to monitor the effects of the programs of President Franklin Roosevelt's New Deal. In 1933, the American Statistical Association (ASA) set up an advisory committee that grew into the Committee on Government Statistics and Information Services (COGSIS), sponsored jointly by ASA and the Social Science Research Council. COGSIS helped to stimulate the use of probability sampling methods in various parts of the Federal government, and it encouraged employees of statistical agencies to carry out research on sampling theory. For example, to establish a technical basis for unemployment estimates, COGSIS, and the Central Statistical Board which it helped to establish, organized an experimental Trial Census of Unemployment as a Civil Works Administration project in three cities, using probability sampling, carried out in late 1933 and early 1934. The positive results from this study led in 1940 to the establishment of the first large-scale, ongoing sample survey on employment and unemployment using probability sampling methods. This survey later
became known as the Current Population Survey, and it continues to the present day. Another somewhat indirect outcome of the COGSIS emphasis on probability sampling took place at the Department of Agriculture Graduate School, where Deming, who recognized the importance of Neyman's 1934 paper, invited Neyman to present a series of lectures in 1937 on sampling and other statistical methods (Neyman 1952). These lectures had a profound impact on the further development of sampling theory, not simply in agriculture, but across the government as well as in universities. Among those who worked on the probability-sampling-based trial Census of Unemployment at the Bureau of the Census was Hansen, who was then assigned with a few others to explore the field of sampling for other possible uses at the Bureau, and went on to work on the 1937 sample Unemployment Census. After working on the sample component of the 1940 decennial census (under the direction of Deming), Hansen worked with others to redesign the unemployment survey based on new ideas on multistage probability samples and cluster sampling (Hansen and Hurwitz 1942, 1943). They expanded and applied their approach in various Bureau surveys, often in collaboration and interaction with others, and this effort culminated in 1953 with the publication of a two-volume compendium of theory and methodology (Hansen et al. 1953a, 1953b). Cochran's (1939) paper, written in England and independently of the US developments, is especially notable because of its use of the analysis of variance in sampling settings and the introduction of superpopulation and modeling approaches to the analysis of survey data. In the 1940s, as results from these two separate schools appeared in various statistical journals, we see some convergence of ideas and results. The theory of estimation in samples with unequal probabilities of selection also emerged around this time (see Horvitz and Thompson 1952, Hansen et al. 1985) (see Sample Surveys: Methods). Statisticians have continued to develop the theoretical basis of alternative methods of probability sampling and statistical inference from sampling data over the past fifty years (see, e.g., Rao and Bellhouse 1990, Särndal et al. 1992). The issue of statistical inference for models from survey data remains controversial, at least for those trained from a traditional finite sampling perspective.
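The central result of that unequal-probability theory can be stated compactly. Writing π_i for the probability that unit i is included in the sample s, the Horvitz–Thompson estimator of the population total Y is

$$
\hat{Y}_{HT} = \sum_{i \in s} \frac{y_i}{\pi_i},
$$

which is unbiased whenever every population unit has π_i > 0.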
5. Market Research and Polling

The 1930s, and especially the period after World War II, however, saw a flowering of survey methodology in market research and polling, as well as in the social sciences more broadly. Initial stimulation came from a number of committees at the Social Science Research Council and social scientists such as Hadley Cantril, Paul Lazarsfeld, Rensis Likert, William Ogburn, and Samuel Stouffer (e.g., see Stephan 1948, Converse 1987 for a discussion). Stouffer actually spent several months in England in 1931–32, and learned about sampling and other statistical ideas from Bowley, Karl and Egon Pearson, and R. A. Fisher. He then participated in the Census of Unemployment at the Bureau of the Census. Market research and polling trace their own prehistory to election straw votes collected by newspapers, dating back at least to the beginning of the nineteenth century. Converse (1987) points out, however, a more serious journalistic base; election polls were taken and published by such reputable magazines as the Literary Digest (which had gained a reputation for accuracy before the 1936 fiasco). Then, as now, election forecasting was taken as the acid test of survey validity. A reputation for accuracy in 'calling' elections was thought to spill over to a presumption of accuracy in other, less verifiable areas. There was a parallel tradition in market research, dating back to just before the turn of the twentieth century, attempting to measure consumers' product preferences and the effectiveness of advertising. It was seen as only a short step from measuring the opinions of potential consumers about products to measuring the opinions of the general public about other objects, either material or conceptual. By the mid-1930s there were several well-established market research firms. Many of them conducted election polls in 1936 and achieved much greater accuracy than did the Literary Digest. It was the principals of these firms (e.g., Archibald Crossley, George Gallup, and Elmo Roper) who put polling (election, public opinion, and consumer) on the map in the immediate pre-World War II period. The polling and market research surveys of Crossley, Gallup, Roper, and others were based on a sampling method involving 'quota controls,' but did not involve random sampling. Stephan (1948) observed the close link between their work and the method of purposive sampling that had gained currency much earlier in government and academic research circles, but which by the 1930s had been supplanted by random sampling techniques.

The 1940s saw a rapid spread of probability sampling methods to a broad array of government agencies. It was, however, only after the fiasco of the 1948 presidential pre-election poll predictions (Mosteller et al. 1949) that market research firms and others shifted towards probability sampling. Even today many organizations use a version of probability sampling with quotas (Sudman 1967).

6. From Sampling Theory to the Study of Nonsampling Error
Amidst the flurry of activity on the theory and practice of probability sampling during the 1940s, attention was being focused on issues of nonresponse and other forms of nonsampling error (e.g., see Deming 1944), such as difficulty in understanding questions or remembering answers. A milestone in this effort to understand and model nonresponse errors was the development of an integrated model for sampling and nonsampling error in censuses and surveys, in connection with planning for and evaluation of the 1950 census (Hansen et al. 1951). This analysis-of-variance-like model, or variants of it, has served as the basis of much of the work on nonsampling error since (see Linear Hypothesis; Nonsampling Errors). New developments in sampling since the mid-twentieth century have less to do with the design of samples and more to do with the structuring of the survey instrument and interview. Thus, there has been a move from face-to-face interviewing to telephone interviewing (with the attendant problems connected with random-digit dialing), and then to the use of computers to assist in interviewing. Nonsampling errors continue to be studied broadly, often under the rubric of cognitive aspects of survey design (see Nonsampling Errors; Sample Surveys: Cognitive Aspects of Survey Design).

See also: Censuses: History and Methods; Government Statistics; Sample Surveys: Methods; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field; Social Survey, History of; Statistical Methods, History of: Post-1900; Statistical Systems: Censuses of Population
Bibliography

Bellhouse D 1988 A brief history of random sampling methods. In: Krishnaiah P, Rao C (eds.) Handbook of Statistics, Vol. 6. North Holland, Amsterdam, pp. 1–14
Bowley A 1913 Working-class households in Reading. Journal of the Royal Statistical Society 76: 672–701
Bowley A 1926 Measurement of the precision attained in sampling. Bulletin of the International Statistical Institute 22(1): 6–62
Cochran W 1939 The use of the analysis of variance in enumeration by sampling. Journal of the American Statistical Association 128: 124–35
Cochran W G 1978 Laplace's ratio estimator. In: David H (ed.) Contributions to Survey Sampling and Applied Statistics: Papers in Honor of H. O. Hartley. Academic Press, New York, pp. 3–10
Converse J 1987 Survey Research in the United States: Roots and Emergence 1890–1960. University of California Press, Berkeley, CA
Deming W 1944 On errors in surveys. American Sociological Review 19: 359–69
Duncan J, Shelton W 1978 Revolution in United States Government Statistics, 1926–1976. US Government Printing Office, Washington, DC
Duncan O 1984 Notes on Social Measurement. Sage, New York
Edgeworth F Y 1912 On the use of the theory of probabilities in statistics relating to society. Journal of the Royal Statistical Society 76: 165–93
Fienberg S, Tanur J 1995 Reconsidering Neyman on experimentation and sampling: Controversies and fundamental contributions. Probability and Mathematical Statistics 15: 47–60
Fienberg S, Tanur J 1996 Reconsidering the fundamental contributions of Fisher and Neyman on experimentation and sampling. International Statistical Review 64: 237–53
Gini C 1928 Une application de la méthode représentative aux matériaux du dernier recensement de la population italienne (1er décembre 1921). Bulletin of the International Statistical Institute 23(2): 198–215
Gini C, Galvani L 1929 Di una applicazione del metodo rappresentativo all'ultimo censimento italiano della popolazione (1 dicembre 1921). Annali di Statistica 6(4): 1–107
Gram J 1883 About calculation of the mass of a forest cover by means of test trees (in Danish). Tidsskrift for Skovbrug 6: 137–98
Hansen M, Dalenius T, Tepping B 1985 The development of sample surveys of finite populations. In: Atkinson A, Fienberg S (eds.) A Celebration of Statistics: The ISI Centenary Volume. Springer Verlag, New York, pp. 327–54
Hansen M, Hurwitz W 1942 Relative efficiencies of various sampling units in population inquiries. Journal of the American Statistical Association 37: 89–94
Hansen M, Hurwitz W 1943 On the theory of sampling from finite populations. Annals of Mathematical Statistics 14: 333–62
Hansen M, Hurwitz W, Madow W 1953a Sample Survey Methods and Theory, Vol. 1. Wiley, New York
Hansen M, Hurwitz W, Madow W 1953b Sample Survey Methods and Theory, Vol. 2. Wiley, New York
Hansen M, Hurwitz W, Marks E, Mauldin W 1951 Response errors in surveys. Journal of the American Statistical Association 46: 147–90
Horvitz D, Thompson D 1952 A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47: 663–85
Jensen A 1926 Report on the representative method in statistics. Bulletin of the International Statistical Institute 22: 359–80
Kiaer A 1895/1896 Observation et expériences concernant des dénombrements représentatifs. Bulletin of the International Statistical Institute 9(2): 176–83
Kiaer A 1897 The representative method of statistical surveys (in Norwegian). Christiania Videnskabsselskabets Skrifter. II Historisk-filosofiske 4 [reprinted with an English trans. by the Central Bureau of Statistics of Norway, Oslo, 1976]
Kruskal W, Mosteller F 1980 Representative sampling, IV: The history of the concept in statistics, 1895–1939. International Statistical Review 48: 169–95
Laplace P 1814 Essai philosophique sur les probabilités. Dover, New York [trans. of 1840 6th edn. appeared in 1951 as A Philosophical Essay on Probabilities, trans. Truscott F W, Emory F L]
Madansky A 1986 On biblical censuses. Journal of Official Statistics 2: 561–69
Mahalanobis P 1944 On large-scale sample surveys. Philosophical Transactions of the Royal Society of London 231(B): 329–451
Mahalanobis P 1946 Recent experiments in statistical sampling in the Indian Statistical Institute. Journal of the Royal Statistical Society 109: 325–78
Mosteller F, Hyman H, McCarthy P J, Marks E S, Truman D B 1949 The Pre-election Polls of 1948. Social Science Research Council, New York
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–625
Neyman J 1952 Lectures and Conferences on Mathematical Statistics and Probability. US Department of Agriculture, Washington, DC [The 1952 edition is an expanded and revised version of the original 1938 mimeographed edition.]
Rao J, Bellhouse D 1990 The history and development of the theoretical foundations of survey based estimation and statistical analysis. Survey Methodology 16: 3–29
Särndal C, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer Verlag, New York
Seneta E 1985 A sketch of the history of survey sampling in Russia. Journal of the Royal Statistical Society, Series A 148: 118–25
Seng Y 1951 Historical survey of the development of sampling theories and practice. Journal of the Royal Statistical Society, Series A 114: 214–31
Splawa-Neyman J 1923 On the application of probability theory to agricultural experiments. Essay on principles (in Polish). Roczniki Nauk Rolniczych Tom (Annals of Agricultural Sciences) X: 1–51 [Sect. 9, translated and edited by D. M. Dabrowska and T. P. Speed, appeared with discussion in Statistical Science (1990) 5: 463–80]
Splawa-Neyman J 1925 Contributions of the theory of small samples drawn from a finite population. Biometrika 17: 472–9 [The note on this republication reads 'These results with others were originally published in La Revue Mensuelle de Statistique, publ. par l'Office Central de Statistique de la République Polonaise, 1923, tom. vi, 1–29']
Stephan F 1948 History of the use of modern sampling procedures. Journal of the American Statistical Association 43: 12–39
Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty before 1900. Harvard University Press, Cambridge, MA
Sudman S 1967 Reducing the Costs of Surveys. Aldine, Chicago
Taeuber C 1978 Census. In: Kruskal W H, Tanur J M (eds.) International Encyclopedia of Statistics. Macmillan and the Free Press, New York, pp. 42–6
Tchouproff A A 1918a On the mathematical expectation of the moments of frequency distributions (Chaps. 1 and 2). Biometrika 12: 140–69
Tchouproff A A 1918b On the mathematical expectation of the moments of frequency distributions (Chaps. 3 and 4). Biometrika 12: 185–210
Tschuprow A A 1923a On the mathematical expectation of the moments of frequency distributions in the case of correlated observations (Chaps. i–iii). Metron 2: 461–93
Tschuprow A A 1923b On the mathematical expectation of the moments of frequency distributions in the case of correlated observations (Chaps. iv–vi). Metron 2: 646–80
Willcox W 1930 Census. In: Seligman E R A, Johnson A (eds.) Encyclopedia of Social Sciences, Vol. 3. Macmillan, New York, pp. 295–300
Yates F 1946 A review of recent developments in sampling and sample surveys (with discussion). Journal of the Royal Statistical Society 109: 12–42
Zarkovic S S 1956 Note on the history of sampling methods in Russia. Journal of the Royal Statistical Society, Series A 119: 336–8
Zarkovich S S 1962 A supplement to 'Note on the history of sampling methods in Russia.' Journal of the Royal Statistical Society, Series A 125: 580–2
S. E. Fienberg and J. M. Tanur
Sample Surveys: Methods

A survey consists of a number of operations or steps of survey design. The ultimate goal of the survey is to generate estimates of population parameters, based on observations of (a sample of) the units that comprise the population. The design steps can be viewed as a chain of links, and the chain is no stronger than its weakest link (see Survey Sampling: The Field). The steps can be labeled in the following way: (a) Research objectives are defined, i.e., the subject-matter problem is translated into a statistical problem. Researchers must define the target population they want to study and the concepts they wish to measure. Indicators (variables) of the concepts are chosen and eventually, questions are formulated. Problems in this first step usually result in relevance errors, see Hox (1997). (b) A frame of the population is developed. The frame could be a list of population units, a map, or even a set of random numbers that could be used to access a population of individuals with telephones. Coverage errors result when frame units do not completely correspond with population units (see Groves 1989). (c) The mode of administering the survey is chosen. The suitable survey mode depends on budget constraints, topic, and the type of measurement that is being considered. Common modes include face-to-face interviews, telephone interviews, diaries, and administrative records. New modes related to the Internet have recently entered the scene, see Lyberg and Kasprzyk (1991) and Dillman (2000). (d) The questionnaire is developed. For each survey construct, one or more survey questions are developed. Questionnaire development is complicated, since the perception of questions varies among respondents and is sensitive to a number of cognitive phenomena. Effects that are treated in the literature include: question wording, question order, order between response alternatives, context, navigational principles, and the influence of interviewers. Many large survey organizations emphasize this step and have created cognitive laboratories to improve their questionnaire work, see Sudman et al. (1996) and Forsyth and Lessler (1991). (e) The sampling design specifies the sampling unit, the method for randomly selecting the sample from the frame, and the sample size. The choice depends on the assumed population variability and the costs of sampling different units, see Särndal et al. (1992).
(f) Data are collected, i.e., observations or measurements are made of the sampled units. Typically both random and systematic errors occur at this stage, see de Leeuw and Collins (1997). (g) Data are processed. This step is needed to make estimation and tabulation possible, given that only raw data exist. The processing includes editing activities where data consistency and completeness are controlled, entry of data (data capture), and coding of variables, see Lyberg and Kasprzyk (1997). All design steps are interdependent. A decision regarding one step has an impact on other steps. Thus, several iterations of the above steps of the design process may be necessary before the final design is determined. Details of the survey process follow.
1. Populations and Population Parameters

A population is a set of units. For example, a population might be the inhabitants of a country, the households in a city, or the companies in a sector of industry. The 'units of study' are the units used as a basis of the analysis, for example individuals, households, or companies. The 'target population' is that particular part of the population about which inference is desired. The 'study variables' are the values of the units studied, for example the age of an individual, the disposable income of a household, or the number of employees in a company. The 'population parameters' are characteristics summarizing features of the population, for example the average disposable income for households in a population. Most often interest centers on finding the population parameters for specific subsets of the population, such as geographical regions or age groups. These subsets are called 'domains of study.'
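In standard notation (not introduced in the original text, but consistent with it), if the population consists of N units with study variable values y_1, …, y_N, two common population parameters are the total and the mean:

$$
Y = \sum_{i=1}^{N} y_i, \qquad \bar{Y} = \frac{Y}{N},
$$

and the corresponding quantities computed over only the units in a domain of study define domain parameters.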
2. Sampling Units, Frames, and Probability Samples

Frequently, statements about some population parameters are problematic because of a lack of the time and funding necessary for surveying all units in the population. Instead we have to draw a sample, a subset of the population, and base conclusions on that sample. High quality conclusions concerning the parameter values require care when designing the sample. Assessing the quality of the conclusions made can be accomplished by using a probability sampling design, that is, a design in which every unit in the population is given a known, nonzero probability of being included in the sample. Given that knowledge, it is possible to produce estimates of the parameters that are unbiased and of high precision. Moreover, it is also possible to estimate the precision of the estimates based on the sample. This is usually in contrast to nonprobability sampling designs where, strictly speaking,
the statistical properties of the estimators are unknown unless one is willing to accept distribution assumptions for the variables under study in combination with the sampling design. Probability sampling requires a frame of sampling units. The units of study in the population can be of different types, such as individuals, households, or companies. Even if the analysis is intended to be based on these units, it is often practical or even necessary to group the population units into units that are better suited for sampling. The reason for this can be that, due to imperfect frames, there is no register of the units of study and therefore the units cannot be selected directly. Instead, groups of units can be selected, for example, villages or city blocks, and interviews are conducted with the people in the villages or blocks. Here the sampling unit consists of a cluster of units of study. In many countries, the demarcation of the clusters is based on geographical units identified through areas on maps. Often the clusters are formed to be a suitable workload for an interviewer or enumerator and are called enumeration areas. The enumeration areas are often formed and updated in population censuses. In countries where the telecommunication system is well developed, interviews are often done by telephone. The sampling unit in this case is the cluster of individuals, i.e., the household members that have access to the same telephone. The frame can take different shapes. For example, a list of N students at a university, complete with names and addresses, is a frame of students. In this case the sampling units and the units of study are the same. This makes it easy to select a sample of n students by numbering the students from 1 to N, generating n random numbers in a computer, and selecting the students whose numbers match the realized random numbers. Note that in doing so we deliberately introduce randomization through the random number generation. This is in contrast to methods that do not use randomization techniques but act as if data are independent identically distributed random variables regardless of how the data were obtained. Often, the sampling units are not the same as the study units. If, for example, the units of study are individuals in a country and the target population consists of all individuals living in the country at a specific date, there is seldom a list or data file that contains the individuals. Also, the cost of making such a list would be prohibitive. Using a sequence of frames containing a sequence of sampling units of decreasing sizes can improve this situation. For example, if there exists a list of villages in the country, then this is a frame of primary sampling units. From a sample of villages, a list of all households within the villages selected can be made. This is a frame of secondary sampling units. Note that there is a need for making the list of households only for the villages selected. Using this list, some of the households may be selected for interviews. Thus, a sequence
containing a sequence of sampling units has been constructed in a way that permits selection of a probability sample of units of study without enumerating all the individuals in the country.

The existence of, or the possibility of creating, reliable frames is a prerequisite for making efficient sampling designs, and the frame situation will often govern the sampling procedure. Ideally the frame should cover the target population and only that population. It should provide the information necessary for contacting the units of study, such as names, addresses, telephone numbers, or the like. If there is auxiliary information about the units, it is very useful to include it in the frame, since auxiliary information can be used to construct efficient sampling designs.

A specific type of frame is used in 'Random Digit Dialing' (RDD), where telephone numbers are selected by generating them at random in a computer. The frame is the set of numbers that could be generated in the computer. The selection probability of a cluster is proportional to the number of telephones to which the household members have access.
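The student-list example above (numbering the frame from 1 to N and drawing n random numbers) is easy to make concrete. The following Python sketch is purely illustrative: the frame contents and the sample size are invented, and a real survey would draw on an actual register.

```python
import random

# Hypothetical frame of N = 1000 students, complete with identifiers.
frame = [f"student_{i}" for i in range(1, 1001)]
n = 25  # desired sample size

# Simple random sampling without replacement: n distinct units are
# drawn so that every unit has the same inclusion probability n/N.
sample = random.sample(frame, n)
print(sample[:5])
```

Here the randomization is introduced deliberately by the random number generator, which is what licenses design-based statements about the resulting estimates.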
3. Sampling Methods and Estimators

The overall goal is to make high-quality estimates of the parameters as cheaply as possible. This is implemented using a 'sampling strategy,' i.e., a combination of a sampling design (a certain combination of sampling methods) and an 'estimator' that is as efficient as possible, i.e., gives the highest precision for a given cost. A variety of sampling methods can be applied in different situations depending on the circumstances. The most frequently employed sampling methods are:

(a) Simple Random Sampling (SRS). Every unit in the population is given an equal chance of being included in the sample.

(b) Systematic Sampling. Every kth unit in the frame is included in the sample, starting from a randomly selected starting point. The sampling step k is chosen so that the sample has a predetermined size.

(c) Stratified Sampling. The units are grouped into homogeneous groups (strata) and then sampled, for example using SRS within each group. Stratified sampling is used either for efficiency reasons or to ensure a certain sample size in some domains of study, such as geographical regions or specific age groups. It is necessary to know the values of a stratification variable in advance to be able to create the strata. Stratification is efficient if the stratification variable is related to the variables under study. If the number of units selected within each stratum is proportional to the number of units in the stratum, this is called proportional allocation; usually it gives fairly efficient estimates (a code sketch follows at the end of this section). It can be improved upon using optimal allocation, which requires knowledge of the variability of the study
variables or correlated variables within each stratum. In many cases the improvement is small compared with proportional allocation.

(d) Unequal Probability Sampling. This is a method that can be employed either for efficiency reasons or out of sheer necessity. Often the probability of selecting a cluster of units is (or is made to be) proportional to the number of units in the cluster. This is called selecting with probability proportional to size (PPS), with size in this case equal to the number of units. The measure of size can vary; the efficiency of the estimator will increase if the measure of size is related to the study variables.

(e) Multistage Sampling. This is a technique that is typically used in connection with frame problems.

The above-mentioned sampling methods can be combined to produce good sampling designs. The following is an example of a 'master sample design' used in some countries. Suppose that the object is to conduct a series of sample surveys concerning employment status, living conditions, and household expenditures. Also suppose that, based on a previous census, there exist enumeration areas (EAs) for the country which are reasonably well updated, so that approximations of the population sizes are available for each EA. Suppose that it is deemed appropriate to stratify the EAs according to geographical regions, both because the government wants these regions as domains of study and because there is reason to believe that the consumption pattern differs between regions. The stratification of the EAs would thus serve both purposes, namely controlling the sample sizes for domains of study and presumably being more efficient than SRS. A number of EAs are selected within each stratum; the number selected is chosen to be proportional to the aggregated population figures in the regions according to the most recent census. The EAs are the primary sampling units. If the EAs vary a lot in size, it could be efficient to select them with PPS; otherwise SRS could be used. The EAs so selected constitute the master sample of primary sampling units, which will be kept the same for a number of years; the actual number of years will depend on the migration rate within the regions. From this master sample a number of households are selected each time a new survey is to be conducted. When selecting the households, it is possible to update the EAs, i.e., to make a new list of the households living in each area, and then select, by systematic sampling or SRS, a number of households to be interviewed. The households are the secondary sampling units. Often the sampling fractions in the second stage are determined in such a way that the resulting inflation factor for the households is the same. This is called a 'self-weighting design.' In principle, a self-weighting design makes it possible to calculate the estimates and the precision of the estimates without a computer.
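As a concrete illustration of stratified sampling with proportional allocation, item (c) above, consider the following Python sketch. The strata, population counts, and sample size are all invented.

```python
import random

# Hypothetical strata (e.g., geographical regions) and their sizes.
strata = {"north": 6000, "centre": 10000, "south": 4000}
n_total = 200  # overall sample size
N = sum(strata.values())

for name, N_h in strata.items():
    # Proportional allocation: each stratum receives a share of the
    # sample proportional to its share of the population.
    n_h = round(n_total * N_h / N)
    # SRS within the stratum; range(N_h) stands in for the stratum frame.
    sample_h = random.sample(range(N_h), n_h)
    print(name, n_h, sample_h[:3])
```

With this allocation every unit has the same inclusion probability n_total/N, so the design is self-weighting in the sense described above.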
4. Estimators, Auxiliary Information, and Sample Weights

To each sampling design there corresponds at least one estimator, which is a function of the sample data used for making statements about the parameters. The form of the estimator depends on the sampling design used and also on whether auxiliary information is included in the function. In survey sampling, some auxiliary information is almost always available, i.e., values of concomitant variables known for all units in the population. If this information is correlated with the study variables, it can be used to increase the efficiency of the sampling strategy: by incorporating the information in the sampling method, as in stratification; in the probability of including the unit in the sample, as in PPS; or by adjusting the estimator. A very general form of estimator is the so-called generalized regression estimator (Cassel et al. 1976), which for estimating the population total takes the form

t_GR = t_y + β̂ (T_x − t_x)

where x denotes the values of the auxiliary variable, T_x the known value of the population total of x, β̂ the regression coefficient between x and the study variable y estimated from the sample, and

t_x = Σ_{i=1}^{n} x_i / α_i
is the so-called Horvitz–Thompson estimator of T_x, with α_i the inclusion probability of unit i. The function t_y is the estimator of the unknown value of the parameter. The motivation for using this estimator is the conviction that the study variable and the auxiliary variable are linearly related; thus, this can be seen as one example of the use of models in survey sampling. If the inclusion probabilities are the same for all units, i.e., if α_i = n/N, then the Horvitz–Thompson estimator becomes the expanded sample mean, which is an unbiased estimator of the population total under the SRS scheme if the expectation is taken with respect to the design properties.

The sample weights are the inflation factors used in the estimator for inflating the sample data. They are usually a function of the inverted values of the inclusion probabilities. In the Horvitz–Thompson estimator, the sample weight of unit i is 1/α_i; for the sample mean, the sample weights become N/n. As was noted earlier, to each sampling design there corresponds an estimator that is natural, i.e., unbiased in the design sense. For example, in the case of multistage probability sampling, the sampling weights are functions of the different selection probabilities in each selection step. If the selection probabilities are carefully chosen, they may form a weight that is constant for all units in the sample, and the value taken by the estimator can be calculated simply by summing the values of the
units in the sample and multiplying the sum by the constant. This is called a 'self-weighted estimator' (and design). The technique was important before the breakthrough of the personal computer, because it saved a great deal of manual labor, but with today's powerful PCs it has lost much of its merit.

The generalized regression estimator can be rewritten in the form
t_GR = Σ_{i=1}^{n} w_i y_i

where w_i is the sample weight pertinent to unit i for this estimator. The 'calibration estimator' is the form of the regression estimator in which the sample weights have been adjusted so that the weighted sample total of x reproduces the known total, i.e., Σ_{i=1}^{n} w_i x_i = T_x. This adjustment causes a small systematic error, which decreases as the sample size increases.
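To make the weighting concrete, the following Python sketch computes a Horvitz–Thompson total and then a simple ratio-type calibration adjustment, which is the most elementary instance of the calibrated weighting described above. All numbers are invented.

```python
# Sampled values of the study variable y and auxiliary variable x,
# with their (hypothetical) inclusion probabilities.
y     = [12.0, 7.5, 9.0, 14.2, 6.8]
x     = [10.0, 8.0, 9.5, 13.0, 7.0]
alpha = [0.02, 0.05, 0.05, 0.02, 0.05]
T_x   = 2000.0  # known population total of x

# Horvitz-Thompson weights are inverted inclusion probabilities.
w   = [1.0 / a for a in alpha]
t_y = sum(wi * yi for wi, yi in zip(w, y))
t_x = sum(wi * xi for wi, xi in zip(w, x))

# Calibrate: rescale the weights so the weighted total of x
# reproduces the known population total T_x exactly.
g       = T_x / t_x
w_cal   = [g * wi for wi in w]
t_y_cal = sum(wi * yi for wi, yi in zip(w_cal, y))
print(t_y, t_y_cal)
```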
5. Assessing the Quality of the Estimates—Precision

The quality of the statements depends, among other things, on the precision of the estimates. Probability sampling supports measurement of the sampling error, since a randomly selected subset of the population is the basis for the estimates. The calculation takes into consideration the randomization induced by the sampling design. Because the design might be complex, the calculation of the precision might differ from what is traditionally used in statistics. For example, the variance of the sample mean ȳ_s for estimating the population mean ȳ under SRS with replacement is S²/n, where S² is the population variance,

S² = (1/(N−1)) Σ_{i=1}^{N} (y_i − ȳ)²
and n is the sample size. This formula is similar to that used in traditional statistical analysis. However, for the Horvitz–Thompson estimator the variance is

(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} ((α_i α_j − α_ij) / α_ij) (y_i / α_i − y_j / α_j)²
where α_ij is the joint inclusion probability of units i and j. As can be seen, the calculation of the variance becomes complicated for complex sampling designs with a large number of selection stages and unequal selection probabilities in each step. The complexity becomes even more pronounced when estimating population parameters in subgroups, where the means are ratios of random variables. However, some shortcut methods have been developed: ultimate cluster techniques, Taylor series, jack-knifing, and replication; see Wolter (1985). It is also evident from the formula for the variance that the variance of the
Horvitz–Thompson estimator depends on the relation between the values of the study variable y and the inclusion probabilities α_i. If the inclusion probability can be made approximately proportional to the values of the study variable, then the variance becomes small. That is one reason why much emphasis is put on the existence of auxiliary information in survey sampling. However, as was shown by Godambe (1955), there does not exist a uniformly minimum variance unbiased estimator in survey sampling when inference is restricted to design-based inference.
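Written out in code, the variance expression above is a double sum that requires the joint inclusion probabilities from the design. The following Python sketch transcribes it directly; all inputs are invented, and in practice the α_ij come from the sampling design itself.

```python
# Sample values, inclusion probabilities, and joint inclusion
# probabilities (hypothetical; alpha_ij[i][i] equals alpha[i]).
y        = [12.0, 7.5, 9.0]
alpha    = [0.30, 0.20, 0.25]
alpha_ij = [[0.30, 0.05, 0.06],
            [0.05, 0.20, 0.04],
            [0.06, 0.04, 0.25]]

var = 0.0
n = len(y)
for i in range(n):
    for j in range(n):
        if i == j:
            continue  # the i = j terms vanish: y_i/a_i - y_i/a_i = 0
        var += 0.5 * (alpha[i] * alpha[j] - alpha_ij[i][j]) / alpha_ij[i][j] \
               * (y[i] / alpha[i] - y[j] / alpha[j]) ** 2
print(var)
```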
6. Assessing the Quality of the Estimates—Total Survey Error

The total survey error can be measured by the mean squared error (MSE), which is the sum of the variance and the squared bias of the estimate. Regular formulas for error estimation do not take all MSE components into account, but rather the precision components mentioned above. Basically, a variance formula includes the random variation induced by the sampling procedure and the random response variation. Components such as correlated variances induced by interviewers, coders, editors, and others have to be estimated separately and added to the sampling and response variance. The same goes for systematic errors that contribute to the bias: they have to be estimated separately and added, so that a proper estimate of the MSE can be obtained. Often the survey quality is visualized by a confidence interval based on the estimated precision. Obviously such an interval might be too short, because it does not take the total error into account.

Some of the sources of correlated variances and biases can be traced to: (a) respondents, who might have a tendency to, for instance, under-report socially undesirable behaviors; (b) interviewers, who might systematically reformulate certain questions and do so in an interviewer-specific fashion; (c) respondents who do not participate because they cannot be contacted or because they refuse; (d) incomplete frames, resulting in, e.g., undercoverage; and (e) coders, who might introduce biased measurements if they tend to erroneously code certain variable descriptions. There are, of course, many other possibilities for error to occur in the estimates.

There are basically two ways of assessing the quality of estimates: (a) the components of the MSE can be estimated, which is a costly and time-consuming operation; or (b) a modeling approach can be used, in which it might be possible to include, e.g., nonresponse errors and coverage errors in the assessment formulas. This method's success depends on the realism of the modeling of the error mechanisms involved.
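A minimal numerical sketch, with invented component values, shows how the MSE combines precision and bias, and why a confidence interval based on precision alone understates the total error:

```python
import math

# Hypothetical error components for one estimate.
sampling_variance    = 0.40
response_variance    = 0.10
interviewer_variance = 0.05  # correlated variance, estimated separately
bias                 = 0.30  # net systematic error, estimated separately

variance = sampling_variance + response_variance + interviewer_variance
mse = variance + bias ** 2
print(mse)                          # 0.64, against 0.55 from precision alone
print(1.96 * math.sqrt(variance))   # ~1.45: precision-only interval half-width
print(1.96 * math.sqrt(mse))        # ~1.57: the precision-only interval is too short
```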
One should bear in mind, however, that for each survey step there are methods designed to keep the errors small. Systematic use of such known, dependable methods decreases the need for evaluation studies or heavy reliance on modeling.

See also: Sample Surveys: Nonprobability Sampling; Sample Surveys: Survey Design Issues and Strategies

Bibliography
Cassel C M, Särndal C E, Wretman J H 1976 Some results on generalized difference and generalized regression estimation for finite populations. Biometrika 63: 615–20
De Leeuw E, Collins M 1997 Data collection methods and survey quality: An overview. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley, New York
Dillman D 2000 Mail and Internet Surveys. Wiley, New York
Forsyth B, Lessler J 1991 Cognitive laboratory methods: A taxonomy. In: Biemer P et al. (eds.) Measurement Errors in Surveys. Wiley, New York
Godambe V P 1955 A unified theory of sampling from finite populations. Journal of the Royal Statistical Society, Series B 17: 269–78
Groves R 1989 Survey Errors and Survey Costs. Wiley, New York
Hox J 1997 From theoretical concept to survey question. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley, New York, pp. 47–70
Lyberg L, Kasprzyk D 1991 Data collection methods and measurement error: An overview. In: Biemer P et al. (eds.) Measurement Errors in Surveys. Wiley, New York
Lyberg L, Kasprzyk D 1997 Some aspects of post-survey processing. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley, New York
Särndal C E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Sudman S, Bradburn N, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Wolter K 1985 Introduction to Variance Estimation. Springer-Verlag, New York
C. M. Cassel and L. Lyberg
Sample Surveys: Model-based Approaches

The theory for sample surveys has been developed using two theoretical frameworks: design-based and model-based. The design-based approach uses the probabilities with which units are selected for the sample for inference; in the model-based approach, the investigator hypothesizes a joint probability distribution for elements in the finite population, and uses that probability distribution for inference. In this article, the two approaches are described and compared, and guidelines are given for when, and how,
one should perform a model-based analysis of survey data. The use of models for survey design is discussed briefly.
1. Inference in Sample Surveys

How does one generalize from individuals in a sample to those not observed? The problem of induction from a sample was much debated by philosophers, social scientists, and mathematicians of the eighteenth and nineteenth centuries, including Immanuel Kant, Charles Peirce, John Venn, and Adolphe Quetelet. In the early years of the twentieth century, many investigators resisted the idea of using survey samples rather than censuses for 'serious statistics' because of inference issues (see Sample Surveys, History of). The debates involving official uses of sample surveys in these years resulted in the development of two philosophical frameworks for inference from a sample: design-based inference and model-based inference.

In design-based inference, first expounded systematically by Neyman (1934), the sample design provides the mechanism for inferences about the population. Suppose that a without-replacement probability sample (see Sample Surveys: Methods) of n units is to be taken from a population of N units. A random variable Z_i is associated with the ith unit in the population; Z_i = 1 if the unit is selected for inclusion in the sample, and Z_i = 0 if the unit is not selected. The joint probability distribution of {Z_1, …, Z_N} is used for inference statements such as confidence intervals (see Estimation: Point and Interval). The quantity being measured on unit i, y_i, is irrelevant for inference in the design-based approach. Whether y_i is household income, years of piano lessons, or number of cockroaches in the kitchen, properties of estimators depend exclusively on properties of the random variables {Z_1, …, Z_N} that describe the probability sampling design. The Horvitz–Thompson (1952) estimator,

Σ_{i=1}^{N} Z_i y_i / π_i     (1)

where π_i = P(Z_i = 1), is an unbiased estimator of the population total Σ_{i=1}^{N} y_i; the variance of the Horvitz–Thompson estimator,

Σ_{i=1}^{N} Σ_{j=1}^{N} y_i y_j (π_i π_j)^{−1} Cov(Z_i, Z_j)

depends on the covariance structure of {Z_1, …, Z_N}.

The design-based approach differs from the inferential framework used in most other areas of statistics. There, y_i is the observed value of a random variable Y_i; the joint probability distribution of the Y_i's and proposed stochastic models allow inferential
statements to be made. Following questions raised by Godambe (1955) about optimal estimation and survey inference, Brewer (1963) and Royall (1970) suggested that the same model-based frameworks used in other areas of statistics also be used in finite population sampling. Thompson (1997) summarized various approaches to prediction using models.

In the model-based approach to inference, the finite population values are assumed to be generated from a stochastic model. Regression models (see Linear Hypothesis: Regression (Basics)) are often adopted for this purpose: if covariates x_i1, x_i2, …, x_ip are known for every unit in the population, a possible model is

Y_i = β_0 + β_1 x_i1 + … + β_p x_ip + ε_i     (2)
where the εi’s are random variables with mean zero and specified covariance structure. The parameters βj for j l 0, …, p may be estimated by standard techniques such as generalized least squares, and the regression equation used to predict the value of y for units not in the sample. Then the finite population total is estimated by summing the observed values of the yi’s for units in the sample and the predicted values of the response for units not in the sample. To illustrate the two approaches, consider a hypothetical survey taken to study mental health status in a population of 10,000 women over age 65. A stratified random sample of 100 urban women and 100 rural women is drawn from a population of 6,000 urban women and 4,000 rural women over age 65. One response of interest is the score on a depression inventory. Let y` U denote the sample mean score for the urban women and let y` R denote the sample mean score for the rural women. Under the design-based approach as exposited in Cochran (1977), πi l 1\60 if person i is in an urban area and πi l 1\40 if person i is in a rural area. Every urban woman in the sample represents herself and 59 other urban women who are in the population but not in the sample; every sampled rural woman represents herself plus 39 other rural women. The Horvitz– Thompson estimator for the mean depression score for the population is N−"N Ziyi\πi; here, the i= estimated population mean score" under the stratified random sampling design is 0.6y` Uj0.4y` R. A 95 percent confidence interval (CI) for the mean refers to the finite population: if a CI were calculated for each possible sample that could be generated using the sampling design, the selection probabilities of the samples whose CIs include the true value of the population mean depression score sum to 0.95. Inference refers to repeated sampling from the population, not to the particular sample drawn. Any number of stochastic models might be considered in a model-based approach. Consider first a special case of the regression model in Eqn. (2), and assume that the εi’s are independent and normally 13463
distributed with constant variance. Model 1 has the form Y_i = β_0 + β_1 x_i + ε_i, where x_i = 1 if person i is an urban resident and x_i = 0 if person i is a rural resident. The stochastic model provides a link between sampled and unsampled units: women who are not in the sample are assumed to have the same mean depression score as women with the same urban/rural status who are in the sample. Under Model 1, β̂_0 = ȳ_R and β̂_1 = ȳ_U − ȳ_R are the least squares estimates of β_0 and β_1. Thus the predicted value of y for each urban woman is ȳ_U, and the predicted value of y for each rural woman is ȳ_R. The model-based estimate of mean depression in this population is thus

(1/10,000) [100ȳ_U + 100ȳ_R + 5,900ȳ_U + 3,900ȳ_R] = 0.6ȳ_U + 0.4ȳ_R

With Model 1, the point and interval estimates of the finite population mean are the same as for the design-based approach. The 95 percent confidence interval is interpreted differently, however; 95 percent of CIs from samples with the same values of x_i that could be generated from the model are expected to contain the true value of β_0 + 0.6β_1.

In this example, model-based inference with Model 1 accords with the results from design-based inference. But suppose that the model adopted is Model 2: Y_i = µ + ε_i. Then the predicted value of the response for all women in the finite population is ȳ = (ȳ_U + ȳ_R)/2, and the mean depression score for the finite population is also estimated by ȳ. If depression is higher among rural women than among urban women, the estimate ȳ from Model 2 will likely overestimate the true mean depression score in the population of 10,000 women, because rural women are not proportionately represented in the sample.

In the model-based approach, inference is not limited to the 10,000 persons from whom the sample is drawn but applies to any set of persons for whom the model is appropriate. Random selection of units is not required for inference as long as the model assumptions hold. If the model is comprehensive enough to include all important information, as in Model 1 above, then the sampling design is irrelevant and can be disregarded for inference.

The two types of inference differ in the conception of randomness. Design-based inference relies on the randomness involved in sample selection; it relates to the actual finite population, and additional assumptions are needed to extend the inferences to other possible populations. Model-based inference is conditional on the selected sample; the randomness is built into the model, and the model assumptions are used to make inferences about units not in the sample. Design-based inference depends on other possible samples that could have been selected from the finite population but were not, while model-based inference
depends on other possible populations that could have been generated under the model but were not.

The approaches are not completely separated in practice. Rao (1997) summarized a conditional design-based approach to inference, in which inference is restricted to a subset of possible samples. Särndal et al. (1992) advocated a model-assisted approach, in which a population model inspires the choice of estimator, but inference is based on the sampling design. In the depression example, a model-assisted estimator could incorporate auxiliary information such as race, ethnicity, and marital status through a model of the form in Eqn. (2); however, the stratified random sampling design is used to calculate estimates and standard errors.
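The contrast between the design-based (or Model 1) estimate and the Model 2 estimate can be put into numbers. The following Python sketch uses invented sample means for the depression example.

```python
# Population and sample sizes from the example above.
N_urban, N_rural = 6000, 4000
n_urban, n_rural = 100, 100
ybar_U, ybar_R = 10.0, 16.0  # hypothetical sample mean depression scores

# Design-based (and Model 1) estimate: strata weighted by population shares.
est_stratified = (N_urban * ybar_U + N_rural * ybar_R) / (N_urban + N_rural)

# Model 2 estimate: the raw sample mean, which weights the two groups
# equally because of the 100/100 allocation.
est_model2 = (ybar_U + ybar_R) / 2

print(est_stratified, est_model2)  # 12.4 versus 13.0
```

With rural scores higher, Model 2 overstates the population mean, precisely because rural women are over-represented in the sample relative to the population.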
2. Models in Descriptive and Analytic Uses of Surveys

Estimating a finite population mean or total is an example of a descriptive use of a survey: the characteristics of a particular finite population are of interest. In much social science research, survey data are used for analytic purposes: investigating relationships between factors and testing sociological theories. Data from the US National Crime Victimization Survey may be used to estimate the robbery rate in 1999 (descriptive), or they may be used to investigate a hypothesized relationship between routine activities and likelihood of victimization (analytic). In the former case, the population of inference is definite and conceivably measurable through a census. In the latter, the population of inference is conceptual; the investigator may well be interested in predicting the likelihood of victimization of a future person with given demographic and routine activity variables.

Smith (1994) argued that design-based inference is the appropriate paradigm for official descriptive statistics based on probability samples. Part of his justification for this position was the work of Hansen et al. (1983), who provided an example in which small deviations from an assumed model led to large biases in inference. Brewer (1999) summarized work on design-based and model-based estimation of population totals and concluded that a model-assisted generalized regression estimator (see Särndal et al. 1992), used with design-based inference, captures the best features of both approaches. Models must of course always be used for inference in nonprobability samples (see Sample Surveys: Nonprobability Sampling); they may also be desirable in probability samples that are too small to allow the central limit theorem to be applied for inference.

Lohr (1999, Chap. 11) distinguished between obtaining descriptive official statistics and uncovering a 'universal truth' in an analytic use of a survey. Returning to the depression example, the investigator might be interested in the relationship between
depression score (y) and variables such as marital status, financial resources, number of chronic health problems, and ability to care for oneself. In this case, the investigator would be interested in testing a theory assumed to hold not just for the particular population of 10,000 women but for other populations as well, and should be making inferential statements about the βs in model (2). The quantity for inference in the design-based setting is b_p, the least squares estimate of β that would be obtained if the x_i's and y_i's were known for all 10,000 persons in the finite population. The quantity b_p would rarely be of primary interest to the investigator, though, since it is merely a summary statistic for this particular finite population. In social research, models are generally motivated by theories, and a model-based analysis allows these theories to be tested empirically.

The generalized least squares estimator of β, β̂_LS, would be the estimator of choice under a pure model-based approach because of its optimality properties under the proposed model. This estimator is, however, sensitive to model misspecification. An alternative, which achieves a degree of robustness to the model at the expense of a possibly higher variance, is to use the design-based estimator of b_p. If the proposed stochastic model is indeed generating the finite population and if certain regularity conditions are met, an estimator that is consistent for estimating b_p will also be consistent for estimating β. Under this scenario, a design-based estimate of b_p also estimates the quantity of primary interest β and has the advantage of being less sensitive to model misspecification.

Regardless of philosophical differences on other matters of inference, it is generally agreed that two aspects of descriptive statistics require the use of models. All methods currently used to adjust for nonresponse (see Nonsampling Errors) employ models to relate nonrespondents to respondents, although the models are not necessarily testable. In small area estimation, sample sizes in some subpopulations of interest are too small to allow estimates of sufficient precision; models are used to relate such subpopulations to similar subpopulations and to useful covariates.
3. Models for Small Area Estimation

In small area estimation, a model is used to estimate the response in subpopulations with few or no sample observations. As an example, the US Current Population Survey (CPS) provides accurate statistics about income and poverty for the nation as a whole. It was not designed, though, to provide accurate estimates in domains such as states, counties, or school districts—the sample would have to be prohibitively large in order to provide precise estimates of poverty for every county in the USA. These domains are called small areas—the term 'small' does not refer to the size of the area or the population, but to the fact that the sample size in the domain is small or may even be zero.

Consider the states to be the small areas, and let y_k be the proportion of school-age children who are poor in state k. The direct estimate ȳ_k of y_k is calculated using data exclusively from the CPS, and V̂(ȳ_k) is an estimate of the variance of ȳ_k. Since in some states V̂(ȳ_k) is unacceptably large, the current practice for estimating poverty at the state level (see National Research Council 2000, p. 49) uses auxiliary information from tax returns, food stamp programs, and the decennial census to supplement the data from the CPS. A regression model for predicting y_k using auxiliary information gives predicted values

ŷ_k = β̂_0 + Σ_j β̂_j x_jk

where the x_jk's represent covariates for state k (e.g., x_1k is the proportion of child exemptions reported by families in poverty in state k, and x_2k is the proportion of people receiving food stamps in state k). The predicted value ŷ_k from the regression equation is combined with the direct estimate ȳ_k from the CPS according to the relative amounts of information present in each: the small area estimate for state k is

ỹ_k = γ_k ȳ_k + (1 − γ_k) ŷ_k

where γ_k is determined by the relative precision of ȳ_k and ŷ_k. If the direct estimate is precise for a state, i.e., V̂(ȳ_k) is small, then γ_k is close to one and the small area estimate ỹ_k relies mostly on the direct estimate. Conversely, if the CPS contains little information about state k's poverty rate, then γ_k is close to zero and ỹ_k relies mostly on the predicted value from the regression model. The small area model allows the estimator for area k to 'borrow strength' from other areas and to incorporate auxiliary information from administrative data or other sources. Ghosh and Rao (1994) and Rao (1999) review properties of this model and other models used in small area estimation.
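A minimal sketch of the composite estimator for a single state follows, with all inputs invented and one conventional way of setting γ_k from the two variances; in practice the direct estimate and its variance come from the survey, and the prediction from a regression fitted across states.

```python
y_direct = 0.21    # direct CPS-type estimate of the poverty proportion
V_direct = 0.0025  # estimated variance of the direct estimate
y_pred   = 0.17    # regression prediction from auxiliary data
V_model  = 0.0009  # assumed variance of the model prediction error

# gamma is close to one when the direct estimate is precise and close
# to zero when the survey carries little information about the state.
gamma = V_model / (V_model + V_direct)
y_small_area = gamma * y_direct + (1 - gamma) * y_pred
print(round(gamma, 3), round(y_small_area, 4))
```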
4. Performing a Model-based Analysis

The first step in a model-based analysis for either descriptive or analytic use is to propose and fit a model to the data. Dependence among units, such as dependence among children in the same school, can be treated using hierarchical linear models or other methods discussed in Skinner et al. (1989). The biggest concern in a model-based analysis, as pointed out by Hansen et al. (1983), is that the model may be misspecified. Many, but not all, of the assumptions implicit in a model can be checked using the sample data. Appropriate plots of the data provide
some graphical checks of model adequacy and of the correctness of the assumed variance structure, as described in Lohr (1999). These assumptions can also be partially checked by performing hypothesis tests of nested models, and by fitting alternative models to the data. In the depression example, plotting the data separately for rural and urban residents would reveal the inadequacy of Model 2 relative to Model 1.

Another method that can sometimes detect model inadequacy is comparison of design-based and model-based estimates of model parameters. As mentioned in Sect. 2, if the model is correct for units in the finite population, then the design-based estimates and the model-based estimates should both be consistent for the model parameters. A substantial difference in the estimates could indicate that the sample design contains information not captured in the model, and that perhaps more covariates are needed in the model. One crucial assumption that cannot be checked using sample data is that the model describes units not in the sample. This assumption is especially important in nonprobability samples and in the use of models for nonresponse adjustment or small area estimation.
5. Models in Survey Design

Kalton (1983) distinguished between the use of models in survey analysis and in survey design, stating that 'the use of models to guide the choice of sample design is well-established and noncontroversial.' In good survey practice, a stratified sampling design is often chosen because it is thought that there are differences among stratum means. An unequal probability design may be employed because of a prior belief that large counties have more variability in the total number of crime victimizations than small counties; models provide a mechanism for formalizing some of the knowledge about population structure and for exploring the results of alternative assumptions. Cochran (1977) illustrated the use of models for designing systematic samples. Särndal et al. (1992, Chap. 12) summarized research on optimal survey design, in which auxiliary information about the population is used to select a design that minimizes the anticipated variance of an estimator under the model and design.

Models used for design purposes do not affect the validity of estimates in design-based inference. A poor model adopted while designing a probability sample may lead to a larger variance of design-based estimates, but the estimates will retain properties such as unbiasedness under repeated sampling. A good model at the design stage often leads to a design with greatly increased efficiency.

A model-based analysis can be conducted on data from any sample, probability or nonprobability. Probability sampling is in theory unnecessary from a pure model-based perspective, and Thompson (1997) and Brewer (1999) concluded that certain forms of
purposive non-probability sampling can be superior to probability sampling when a model-based analysis is to be conducted and the model is correct. In practice, however, there is always concern that the assumed model may miss salient features of the population, and probability sampling provides some protection against this concern. For a ratio model, with Y_i = βx_i + ε_i and V[ε_i] = σ²x_i, the model-based optimal design specifies a purposive sample of the population units with the largest x values. Such a design does not allow an investigator to check whether the model is appropriate for small x's; an unequal probability sample with π_i proportional to x_i does allow such model checking, and allows inferences under either design- or model-based frameworks. As Brewer (1999) pointed out, there is a widespread public perception that 'randomized sampling is fair,' and that perception provides a powerful argument for using probability sampling for official statistics.

The following sources are useful for further exploration of modes of inference in sample surveys. Lohr (1999, Chap. 11) provides a more detailed heuristic discussion of the role of models in survey sampling; Thompson (1997) gives a more mathematical treatment. The articles by Smith (1994), Rao (1997), and Brewer (1999) contrast inferential philosophies, discuss appropriate use of models in analysis of survey data, and provide additional references.

See also: Sample Surveys, History of; Sample Surveys: Methods; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field
Bibliography

Brewer K R W 1963 Ratio estimation and finite populations: some results deducible from the assumption of an underlying stochastic process. Australian Journal of Statistics 5: 93–105
Brewer K R W 1999 Design-based or prediction-based inference? Stratified random vs. stratified balanced sampling. International Statistical Review 67: 35–47
Cochran W G 1977 Sampling Techniques, 3rd edn. Wiley, New York
Ghosh M, Rao J N K 1994 Small area estimation: An appraisal. Statistical Science 9: 55–76
Godambe V P 1955 A unified theory of sampling from finite populations. Journal of the Royal Statistical Society B 17: 269–78
Hansen M H, Madow W G, Tepping B J 1983 An evaluation of model-dependent and probability-sampling inferences in sample surveys. Journal of the American Statistical Association 78: 776–93
Horvitz D G, Thompson D J 1952 A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47: 663–85
Kalton G 1983 Models in the practice of survey sampling. International Statistical Review 51: 175–88
Lohr S L 1999 Sampling: Design and Analysis. Duxbury Press, Pacific Grove, CA
National Research Council 2000 Small-area Income and Poverty Estimates: Priorities for 2000 and Beyond. Panel on Estimates
of Poverty for Small Geographic Areas, Committee on National Statistics. National Academy Press, Washington, DC
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–606
Rao J N K 1997 Developments in sample survey theory: An appraisal. Canadian Journal of Statistics 25: 1–21
Rao J N K 1999 Some recent advances in model-based small area estimation. Survey Methodology 25: 175–86
Royall R M 1970 On finite population sampling theory under certain linear regression models. Biometrika 57: 377–87
Särndal C E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Skinner C J, Holt D, Smith T M F 1989 Analysis of Complex Surveys. Wiley, New York
Smith T M F 1994 Sample surveys 1975–1990; an age of reconciliation? International Statistical Review 62: 5–34
Thompson M E 1997 Theory of Sample Surveys. Chapman & Hall, London
S. L. Lohr
Sample Surveys: Nonprobability Sampling

A sample collected from a finite population is said to be a probability sample if each unit of the population has a nonzero probability of being selected into the sample, and that probability is known. Traditional methods of probability sampling include simple and stratified random sampling, and cluster sampling. Conclusions concerning the population may be obtained by design-based, or randomization, inference. See Sample Surveys: The Field and Sample Surveys: Methods. The values of variables of interest in the population are considered as fixed quantities, unknown except for those units selected into the sample. Inference proceeds by considering the behavior of estimators of quantities of interest under the randomization distribution, based on the known selection probabilities. For example, if the N population values of variable Y are denoted Y_1, …, Y_N and the n sample values by y_1, …, y_n, then ȳ, the sample mean, is a possible estimator for Ȳ, the population mean. If the sample is obtained by simple random sampling, then, with respect to this randomization distribution, ȳ is unbiased for Ȳ and has sampling variance

((N − n) / (Nn(N − 1))) Σ_{i=1}^{N} (Y_i − Ȳ)²

Nonprobability sampling refers to any method of obtaining a sample from a population which does not satisfy the criteria for probability sampling. Nonprobability samples are usually easier and cheaper to collect than probability samples, as the data collector is allowed to exercise some choice as to which units to include in the sample. For a probability sample, this choice is made entirely by the random sampling mechanism. However, methods of design-based inference cannot be applied to assess the bias or variability of estimators based on nonprobability samples, as such methods do not allow for unknown or zero selection probabilities.

Surveys carried out by national statistical agencies invariably use probability sampling. Marsh and Scarborough (1990) also noted 'the preponderance of probability sampling in university social science.' Nonprobability sampling is much more common in market and opinion research. However, Taylor (1995) observed large national differences in the extent to which nonprobability sampling, particularly quota sampling, is viewed as an acceptable tool for market research. In Canada and the USA, probability sampling using telephone polling and random-digit dialing is the norm for public opinion surveys. In Australia and South Africa probability sampling is also prevalent, but with face-to-face interviews. On the other hand, in many European countries, such as France and the UK, quota sampling is much more common.
1. Convenience Sampling
The easiest and cheapest way to collect sample data is to collect information on those population units which are most readily accessible. A university researcher may collect data on students. Surveys carried out through newspapers, television broadcasts, or Internet sites (as described, for example, by Bradley 1999) are necessarily restricted to those individuals who have access to the medium in question. Sometimes only a small fraction of the population is accessible, in which case the sample may consist of exactly those units which are available for observation.

Some surveys involve an element of self-selection, where individuals decide whether or not to include themselves in the sample. If participation is time-consuming, or financial cost is involved, then the sample is more likely to include individuals with an interest in the subject of the survey. This may not be important. For example, an interest in participating in an experimental study of behavior might be considered unlikely to be associated with the outcome of the experiment. However, where the variable of interest relates to opinion on a question of interest, as is often the case in newspaper, television, or Internet polls, it is likely that interest in participation is related to opinion, and it is much harder to justify using the sample data to make conclusions about a wider population.

A famous example of the failure of such a nonprobability sample to provide accurate inferences about a wider population is the Literary Digest poll of 1936. Ten million US citizens were sent postcard ballots concerning the forthcoming presidential election. Around 2 million of these were returned, a sample size which, if associated with a simple random sample, would be expected to predict the population with negligible error. However, when calibrated against the
election results, the Literary Digest poll was in error by 19 percentage points in predicting Roosevelt's share of the vote.

On the other hand, useful inferences can be made using convenience samples. Smith and Sugden (1988) considered statistical experiments, where the allocation of a particular treatment to the units under investigation is controlled, usually by randomization. In such experiments, the selection of units is not usually controlled and is often a convenience sample; for example, the individuals might be volunteers. Nevertheless, inferences are often successfully extended to a wider population. Similarly, observational studies, where neither treatment allocation nor sample selection is controlled, usually because it is impossible to do so, can be thought of as arising from convenience samples. Smith (1983) noted that Doll and Hill (1964), in their landmark study of smoking and health, used a sample entirely made up of medical practitioners. However, the validity of extending conclusions based on their data to the general population is now widely recognized.

Studies based on convenience samples can be an extremely effective way of conducting preliminary investigations, but it is desirable that any important conclusions drawn about a wider population are investigated further, preferably using probability samples. Where some kind of explanatory, rather than simply descriptive, inference is desired, Smith and Sugden (1988) argued that 'the ideal studies are experiments within surveys in which the scientist has control over both the selection of units and the allocation of treatments.' This approach was considered in detail by Fienberg and Tanur (1989).
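The Literary Digest lesson, that a huge self-selected sample can still be badly biased, is easy to reproduce in a toy simulation. Everything below is invented: a population in which 45 percent favor a proposition, and a response mechanism in which those in favor are keener to reply.

```python
import random

random.seed(1)
population = [1] * 45_000 + [0] * 55_000  # 1 = in favor; true proportion 0.45

def responds(opinion):
    # Response probability depends on the opinion being measured.
    return random.random() < (0.30 if opinion == 1 else 0.10)

sample = [y for y in population if responds(y)]
print(len(sample), round(sum(sample) / len(sample), 3))
```

Despite a "sample" of roughly 19,000, the estimated proportion comes out near 0.71, not 0.45: sample size does nothing to cure selection bias.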
2. Quota Sampling

When using survey data to draw an inference about a population of interest, the hope of the analyst is that sample estimators of quantities of interest are close to the corresponding population values. If a nonprobability sample has been collected, then it is instructive to observe the precision of sample estimators of known population quantities. For example, how do the sample proportions of males and females compare with the known population values? If they differ substantially, then the sample is 'unrepresentative' of the population, and one might have legitimate cause for concern about the reliability of estimates of unknown quantities of interest. Purposive sampling is a term used for methods of choosing a nonprobability sample in a way that makes it 'representative' of the population, although there is no generally agreed definition of a representative sample, and purposive sampling is often based on subjective considerations.

In quota sampling, the sample selection is constrained to ensure that the sample proportions of certain control variables approximately match the
known population proportions. For example, if the population proportions of males and females are equal, then equal numbers of male and female units are selected into the sample. Age groups are also commonly used in designing quota samples. Sample totals for each cell of a cross-classification of two or more control variables (for example, age by sex) may also be fixed by the design. Examples are given by Moser and Kalton (1971). Quota sampling is most commonly used in market and opinion research, where control variables usually include age, sex, and socioeconomic class. Other variables such as employment status and housing tenure are also used. The known population proportions for the control variables are calculated from census data, or from surveys based on large probability samples. Variables with known population totals which are not used in setting quotas may be used for weighting in any subsequent analyses. Where data collection involves visiting households, further constraints beyond the quotas may be applied to sample selection. For example, data collectors may be assigned a prespecified travel plan. However, where the mode of data collection involves intercepting individuals on the street for interview, then the only constraint on the data collector may be to satisfy the quotas. It is this freedom given to the data collector that provides both the biggest advantage and biggest disadvantage of quota sampling. The advantage is that with only the quota constraints to satisfy, data collection is relatively easy. Such surveys can be carried out rapidly by an individual data collector performing interviews on a busy street corner. As with any nonprobability sampling scheme, however, there is no way of assessing the bias associated with quota sampling. The sample units are necessarily selected from those which are available to the data collector, given their mode of interviewing. If availability is associated with any of the survey variables, then significant bias may occur. Advocates of quota sampling argue that the quotas control for this, but there is no way of guaranteeing that they do. Neither can design-based inference be used to assess the variability of estimates based on quota samples. Sometimes, a simple model is used to assess this variability. If one assumes that the data collectors used are drawn from a population of possible data collectors, then the ‘between collector’ variance combines both sampling variability and interviewer variability. Deville (1991) modeled the quota sampling process and provided some alternative measures of variability. Studies comparing quota and probability sampling have been carried out. Moser and Stuart (1953) discovered apparent availability biases in the quota samples they investigated, with respect to the variables occupation and education. In particular, they noticed that the quota samples underestimated the proportion of population with lower levels of education. Marsh and Scarborough (1990) investigated nine possible sources of availability bias in quota samples. They
found that, amongst women, their quota sample overestimated the proportion from households with children. Both studies found that the quota samples tended to underestimate the proportion of individuals in the extreme (high and low) income groups.

Quota samples are often used for political opinion polls preceding elections. In such examples they can be externally validated against the election results, and historically quota samples have often been shown to be quite accurate. Indeed, Worcester (1996) argued that election forecasts using quota samples for UK elections in the 1970s were more accurate than those using probability samples. Smith (1996) presented similar evidence. However, it is also election forecasting that has brought quota sampling under the closest scrutiny. In the US presidential election of 1948, the Crossley, Gallup, and Roper polls all underestimated Truman's share of the vote by at least five percentage points and, as a consequence, predicted the wrong election winner. Mosteller et al. (1949), in their report on the failure of the polls, found one of the two main causes of error to be errors of sampling and interviewing, and concluded (p. 304) that 'it is likely that the principal weakness of the quota control method occurred at the local level at which respondents are selected by interviewers.' The UK general election of 1992 saw a similarly catastrophic failure, with the pre-election polls giving Labour an average lead of around 1.5 percentage points; in the election, the Conservative lead over Labour was 7 percentage points. A report by the Market Research Society Working Party (1994) into the failure of the polls identified inaccuracies in setting the quota controls as one of a number of possible sources of error: as a result, the sample proportions of the key variables did not accurately reflect the proportions in the population. Lynn and Jowell (1996) attributed much of the error to the selection bias inherent in quota sampling, and argued for increased use of probability sampling methods for future election forecasts.
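The quota mechanism itself is simple enough to sketch: the data collector accepts whoever comes along until each cell of the quota is filled. In the following Python fragment the quotas and the stream of passers-by are invented; the point is that selection is driven by availability, not by randomization, so inclusion probabilities remain unknown.

```python
import random

random.seed(2)
quotas = {("F", "18-44"): 30, ("F", "45+"): 20,
          ("M", "18-44"): 30, ("M", "45+"): 20}
counts = {cell: 0 for cell in quotas}
sample = []

while any(counts[c] < quotas[c] for c in quotas):
    # Whoever happens to pass the street corner.
    person = (random.choice("FM"), random.choice(["18-44", "45+"]))
    if counts[person] < quotas[person]:
        counts[person] += 1
        sample.append(person)

print(counts)  # all quotas met; the age-by-sex margins match the design
```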
3. A Formal Framework

As methods of design-based inference cannot be applied to data obtained by nonprobability sampling, any kind of formal assessment of the bias and variability associated with nonprobability samples requires a model-based approach (see Sample Surveys: Model-based Approaches). Smith (1983) considered the following framework, which can be used to assess the validity of inferences from various kinds of nonprobability samples. Let i = 1, …, N denote the population units, vector Y_i the values of the unknown survey variables, and vector Z_i the values of variables which are known prior to the survey. Let A be a binary variable indicating whether a unit is selected into the sample (A_i = 1) or not (A_i = 0), and let A_s be the values of A for the observed sample. Smith (1983)
modeled the population values of Y and the selection process jointly through

f(Y, A_s | Z; θ, φ) = f(Y | Z; θ) f(A_s | Y, Z; φ)     (1)
where θ and φ are distinct model parameters for the population model and the selection model, respectively. Given A_s, Y can be partitioned as (Y_s, Y_s̄) into observed and unobserved values. Inferences based on the observed data model f(Y_s | Z; θ) and extended to the population are said to ignore the selection mechanism, and in situations where this is valid, the selection is said to be ignorable (Rubin 1976); see Statistical Data, Missing. Selection is ignorable when

f(A_s | Y, Z; φ) = f(A_s | Z; φ)     (2)
so that the probability of making the observed selection, for given Z, is the same for all Y. A sufficient condition for this is that A and Y are conditionally independent given Z. A probability sampling scheme, perhaps using some stratification or clustering based on Z, is clearly ignorable. Nonprobability sampling schemes based on Z (for example, selecting exactly those units corresponding to a particular set of values of Z) are also ignorable. However, whether or not inferences are immediately available for values of Z not contained in the sample depends on the form of the population model f(Y | Z; θ) and, in particular, on whether the entire θ is estimable using Y_s. If Y is independent of Z then there is no problem, but this is an assumption which cannot be verified by sample data based on a restricted sample of values of Z. If this assumption seems implausible, then post-stratification may help. Smith (1993) considered partitioning the variables comprising Y into measurement variables Y^m and stratification variables Y^q, and post-stratifying. If

f(Y_s^m | Y_s^q, Z; ξ) = f(Y_s^m | Y_s^q; ξ)     (3)
where ξ are the parameters of the post-stratification model, then inference for any Z is available. This condition implies that, given the observed values Y_s^q of the stratification variables, Z gives no further information concerning the measurement variables. This approach provides a way of validating certain inferences based on a convenience sample, where Z is an indicator variable defining the sample.

Smith (1983) also considered ignorability for quota sampling schemes. He proposed modeling selection into a quota sample in two stages: selection into a larger sample for whom the quota variables Y^q are recorded, followed by selection into the final sample, based on a unit's quota variables and the requirements to fill the quota. For the final sample, the variables of interest Y^m are recorded. Two ignorability conditions result, requiring that at neither stage does the probability of selection, given Y^q and Z, depend on Y^m.
Sample Sureys: Nonprobability Sampling This formal framework makes clear, through expressions such as (2) and (3) when model-based inferences from nonprobability samples can and cannot be used to provide justifiable population inferences. However, it is important to realize that the assumptions required to ensure ignorability cannot be verified using the sample data alone. They remain assumptions which need to be subjectively justified before extending any inferences to a wider population. These formal concepts of ignorability confirm more heuristic notions of what is likely to comprise a good nonprobability sampling scheme. For example, opinion polls with a large element of self-selection are highly unlikely to result in an ignorable selection. On the other hand one might have much more faith in a carefully constructed quota sampling scheme, where data collectors are assigned to narrowly defined geographical areas, chosen using a probability sampling scheme, and given restrictive guidelines on choosing the units to satisfy their quota.
4. Discussion

The distinction between probability sampling and nonprobability sampling is necessarily coarse. At one extreme is a carefully constructed probability survey with no nonresponse; at the other is a sample chosen entirely for the investigator's convenience. Most surveys fall between these two extremes and therefore, strictly speaking, should be considered nonprobability samples. Examples include quota surveys of households where the geographical areas for investigation are chosen using a probability sample, or statistical experiments where a convenience sample of units is assigned treatments using a randomization scheme. The validity of any inferences extended to a wider population depends on the extent to which the selection of units is ignorable for the inference required.

This applies equally to any survey with nonresponse. The presence of nonresponders in a probability survey introduces a nonprobability element into the selection mechanism, so the ignorability (of nonresponse) needs to be considered. However, surveys with probability sampling usually make a greater effort to minimize nonresponse than nonprobability surveys, where there is little incentive to do so. Furthermore, even with nonresponse, it is easier to justify ignorability of a probability sampling mechanism.

Further details concerning specific issues may be obtained from the sources referenced above. Alternative perspectives on nonprobability sampling are provided by general texts on sampling such as Hansen et al. (1953), Stephan and McCarthy (1958), and Moser and Kalton (1971).

See also: Sample Surveys, History of; Sample Surveys: Survey Design Issues and Strategies
Bibliography

Bradley N 1999 Sampling for internet surveys: An examination of respondent selection for internet research. Journal of the Market Research Society 41: 387–95
Deville J-C 1991 A theory of quota surveys. Survey Methodology 17: 163–81
Doll R, Hill A B 1964 Mortality in relation to smoking: ten years' observations of British doctors. British Medical Journal 1: 1399–410
Fienberg S E, Tanur J M 1989 Combining cognitive and statistical approaches to survey design. Science 243: 1017–22
Hansen M H, Hurwitz W N, Madow W G 1953 Sample Survey Methods and Theory. Volume 1: Methods and Applications. Wiley, New York
Lynn P, Jowell R 1996 How might opinion polls be improved? The case for probability sampling. Journal of the Royal Statistical Society A 159: 21–8
Market Research Society Working Party 1994 The Opinion Polls and the 1992 General Election. Market Research Society, London
Marsh C, Scarborough E 1990 Testing nine hypotheses about quota sampling. Journal of the Market Research Society 32: 485–506
Moser C A, Kalton G 1971 Survey Methods in Social Investigation. Heinemann, London
Moser C A, Stuart A 1953 An experimental study of quota sampling (with discussion). Journal of the Royal Statistical Society A 116: 349–405
Mosteller F, Hyman H, McCarthy P J, Marks E S, Truman D B 1949 The Pre-election Polls of 1948: Report to the Committee on Analysis of Pre-election Polls and Forecasts. Social Science Research Council, New York
Rubin D B 1976 Inference and missing data. Biometrika 63: 581–92
Smith T M F 1983 On the validity of inferences from nonrandom samples. Journal of the Royal Statistical Society A 146: 394–403
Smith T M F 1996 Public opinion polls: the UK general election, 1992. Journal of the Royal Statistical Society A 159: 535–45
Smith T M F, Sugden R A 1988 Sampling and assignment mechanisms in experiments, surveys and observational studies. International Statistical Review 56: 165–80
Stephan F F, McCarthy P J 1958 Sampling Opinions: An Analysis of Survey Procedure. Wiley, New York
Taylor H 1995 Horses for courses: how survey firms in different countries measure public opinion with very different methods. Journal of the Market Research Society 37: 211–19
Worcester R 1996 Political polling: 95% expertise and 5% luck. Journal of the Royal Statistical Society A 159: 5–20
J. J. Forster
Sample Surveys: Survey Design Issues and Strategies

A treatment of survey questions intended to be useful to those wishing to carry out or interpret actual surveys should consider several issues: the basic difference between questions asked in surveys and
questions asked in ordinary social interaction; the problems of interpreting tabulations based on single questions; the different types of survey questions that can be asked; the possibility of bias in questioning; and the insights that can be gained by combining standard survey questions with randomized experiments that vary the form, wording, and context of the questions themselves. Each of these issues is treated in this article.
1. The Unique Nature of Survey Questioning

A fundamental paradox of survey research is that we start from the purpose of ordinary questioning as employed in daily life, yet our results are less satisfactory for that purpose than for almost any other. In daily life a question is usually asked because one person wishes information from another. You might ask an acquaintance how many rooms there are in her house, or whether she favors the legalization of abortion in America. The assumption on both sides of the interaction is that you are interested in her answers in and of themselves. We can call such inquiries ordinary questions. In surveys we use similar inquiries—that is, their form, their wording, and the manner of their asking are seldom sharply distinguishable from ordinary questions. At times we may devise special formats, with names like Likert-type or forced-choice, but survey questions cannot depart too much from ordinary questioning because the essential nature of the survey is communication with people who expect to hear and respond to ordinary questions. Not surprisingly, respondents believe that the interviewer or questionnaire is directly interested in the facts and opinions they give, just as would an acquaintance who asked the same questions. They may not assume a personal interest in their answers, but what they do assume is that their answers will be combined with the answers of all others to give totals that are directly interpretable. Thus, if attitudes or opinions are inquired into, the survey is viewed as a kind of referendum and the investigator is thought to be interested in how many favor and how many oppose legalized abortion or whatever else is at issue. If facts are being asked about, the respondent expects a report telling how many people have what size homes, or whatever the inquiry is about. (By factual data we mean responses that correspond to a physical reality and could, in principle, be provided by an observer as well as by a respondent, for example, when counting rooms. By attitudinal data we mean responses that concern subjective phenomena and therefore depend on self-reports by respondents. The distinction is not airtight: for example, the designation of a respondent's 'race' can be based on self-report but also on the observations of others, and the two may differ without either being
clearly 'wrong.') In this article the focus will be on attitudes, including opinions, beliefs, and values, though much of the discussion can be applied to factual data as well. Experienced survey researchers know that a simple tally of responses to a question—what survey researchers refer to as the 'marginals'—is usually too much a function of the way the question was asked to allow any simple interpretation. The results of questions on legalized abortion depend heavily on the conditions, definitions, and other subtleties presupposed by the question wording, and the same is true to an extent even for a question on how many rooms there are in a house. Either we must keep a question quite general in phrasing and leave the definitions, qualifications, and conditions up to each respondent—which invites unseen variations in interpretation—or we must try to make the question much more limited in focus than was usually our goal in the first place. Faced with these difficulties in interpreting univariate results from separate questions, survey investigators can proceed in one or both of two directions. One approach is to ask a wide range of questions on an issue and hope that the results can be synthesized into a general conclusion, even though this necessarily involves a fair amount of judgment on the part of the researcher. The other direction—the one that leads to standard survey analysis—is to hold constant the question (or the index, if more than a single item is being considered) and make comparisons across time or other variables. We may not be sure of exactly what 65 percent means in terms of general support for legalized abortion, but we act on the assumption that if the question wording and survey conditions have been kept constant, we can say, within the limits of sampling error, that it represents such and such an increase or decrease from an earlier survey that asked the same question of a sample from the same population. Or if 65 percent is the figure for men and 50 percent is the figure for women, a sex difference of approximately 15 percentage points exists (a sketch of how such a comparison is assessed against sampling error appears at the end of this section). Moreover, research indicates that in most cases relationships are less affected by variations in the form of question than are univariate distributions—generalized as the rule of 'form-resistant correlations' (Schuman and Presser 1981). The analytic approach, together with the use of multiple questions (possibly further combined on the basis of a factor analytic approach), can provide a great deal of understanding and insight into an attitude, though it militates against a single summary statement of the kind that respondents expect to hear. (Deming's (1968, p. 601) distinction between enumerative and analytic studies is similar, but he treats the results from enumerative studies as unproblematic, using a simple factual example of counting the number of children. In this article, univariate results based on attitude questions are regarded as questionable attempts to simulate actual
referenda. Thus the change in terminology is important.)

This difference between what respondents expect—the referendum point of view—and what the sophisticated survey researcher expects—the analytic point of view—is often very great. The respondent in a national survey believes that the investigator will add up all the results, item by item, and tell the nation what Americans think. But the survey investigator knows that such a presentation is usually problematic at best and can be dangerously misleading at worst. Moreover, to make matters even more awkward, political leaders often have the same point of view as respondents: they want to know how many people favor and how many oppose an issue that they see themselves as confronting. Yet it may be neither possible nor desirable for the survey to pose exactly the question the policy maker has in mind, and in any case such a question is likely to be only one of a number of possible questions that might be asked on the issue.
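As a minimal sketch of the sampling-error qualification attached to such comparisons, the fragment below computes an approximate 95 percent confidence interval for a difference of two proportions, assuming simple random sampling and independent subgroups; the percentages echo the invented example above, and the sample sizes are hypothetical.

```python
import math

# Invented figures: 65% of 500 men vs. 50% of 500 women favor a policy.
p1, n1 = 0.65, 500
p2, n2 = 0.50, 500

# Standard error of the difference of two independent proportions.
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2

# Approximate (normal-theory) 95 percent confidence interval.
lo, hi = diff - 1.96 * se, diff + 1.96 * se
print(f"difference = {diff:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```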
2. Problems with the Referendum Point of View

There are several reasons why answers obtained from isolated questions are usually uncertain in meaning. First, many public issues are discussed at a general level as though there is a single way of framing them and as though there are just two sides. But what is called the abortion issue, to follow our previous example, consists of a large number of different issues having to do with the reasons for abortion, the trimester involved, and so forth. Likewise, what is called 'gun control' can involve different types of guns and different kinds of controls. Except at the extremes, exactly which of these particular issues is posed, and with what alternatives, makes a considerable difference in the univariate results. Indeed, what is often reported as a conflict in findings between two surveys is due to their having asked about different aspects of the same general issue. A second problem is that answers to survey questions always depend on the form in which the question is asked, because most respondents treat that form as a constraint on their answers. If two alternatives are given by the interviewer, most respondents will choose one, rather than offering a substitute of their own that they might prefer. For example, in one survey-based experiment the authors identified the problems spontaneously mentioned when a national sample of Americans was asked to name the most important problem facing the country. A parallel question was then formulated for a comparable sample that included none of the four problems mentioned most often spontaneously, but instead four problems that had been mentioned by less than three percent of the population in toto, together with an invitation to respondents to substitute a different problem if they wished. Despite the invitation, the majority of
respondents (60 percent) chose one of the rare problems offered explicitly, reflecting their unwillingness to go outside the frame of reference provided by the question (Schuman and Scott 1987). Evidently, the form of a question is treated by most people as setting the 'rules of the game,' and these rules are seldom challenged even when encouragement is offered. It might seem as though the solution to the rules-of-the-game constraint is to keep questions 'open'—that is, not to provide specific alternatives. This is often a good idea, but it is not fail-safe. In a related experiment on important events and changes from the recent past, 'the development of computers' was not mentioned spontaneously nearly as often as economic problems, but when it was included in a list of past events along with economic problems, the development of computers turned out to be the most frequent response (Schuman and Scott 1987). Apparently people asked to name an important recent event or change thought that the question referred only to political events or changes, but when the legitimacy of a different kind of response was made explicit, it was heavily selected. Thus a question can be constraining even when it is entirely open and even when the investigator is unaware of how it affects the answers respondents give. A third reason for the limitations of univariate results is the need for comparative data to make sense of any interpretation. Suppose that a sample of readers of this article is asked to answer a simple yes/no question as to its value, and that 60 percent reply positively and 40 percent negatively. Leaving aside all the problems of question wording discussed thus far, such percentages can be interpreted only against the backdrop of other articles. If the average yes percentage for all articles is 40 percent, the author might feel proud of his success. If the average is 80 percent, the author might well hang his head in shame. We are all aware of the fundamental need for this type of comparison, yet it is easy to forget the difficulty of interpreting absolute percentages when we feel the urge to speak definitively about public reactions to a unique event. Finally, in addition to all of the above reasons, there are sometimes subtle features of wording that can affect answers. A classic example of a wording effect is the difference between 'forbidding' something and 'not allowing' the same thing (Rugg 1941). A number of survey experiments have shown that people are more willing to 'not allow' a behavior than to 'forbid' the same behavior, even though the practical effects of the distinction in wording are nil (Holleman 2000). Another subtle feature is context: for example, a question about abortion in the case of a married woman who does not want any more children is answered differently depending on whether or not it is preceded by a question about abortion in the case of a defective fetus (Schuman and Presser 1996 [1981]). The problems of wording and context appear equally when an actual referendum is to be carried out
by a government: considerable effort is made by politicians on all sides of the issue to control the wording of the question to be voted on, as well as its placement on the ballot, with the battle over these decisions sometimes becoming quite fierce. This shows that there is never a single way to phrase a referendum and that even small variations in final wording or context can influence the outcome of the voting. The same is true for survey questions, but with the crucial difference that they are meant to provide information, not to determine policy in a definitive legal sense. The analytic approach, when combined with the use of multiple questions to tap different aspects of an issue, provides the most useful perspective on survey data. Rather than focusing on the responses to individual items as such, the analysis of change over time and of variations across demographic and social background variables provides the surest route to understanding both attitudinal and factual data. Almost all important scholarly work based on surveys follows this path, giving attention to individual percentages only in passing. In addition, in recent years, classic between-subjects experiments have been built into surveys, with different ways of asking a question administered to random subsamples of a larger probability sample in order to learn about the effects of question wording (Schuman and Presser 1996 [1981]). These survey-based experiments, traditionally called 'split-ballots,' combine the advantage of a probability sample survey—generalization to a much larger population—with the advantage of randomized treatments for testing causal hypotheses. Survey-based experiments have been used to investigate a variety of methodological uncertainties about question formulations, as we will see below, and are also employed increasingly to test hypotheses about substantive political and social issues.
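In outline, the analysis of a split-ballot experiment reduces to a randomized two-group comparison. The sketch below applies a standard two-proportion z-test to fabricated counts of 'agree' responses under two wordings of the same question; an analysis of real survey data would also need to reflect the complex sample design.

```python
import math

# Fabricated split-ballot results: each wording was administered to a random
# half of the same probability sample.
agree_A, n_A = 312, 500  # form A (e.g., a 'forbid' wording)
agree_B, n_B = 268, 500  # form B (e.g., a 'not allow' wording)

pA, pB = agree_A / n_A, agree_B / n_B
# Two-proportion z-test, with the proportion pooled under the null
# hypothesis of no wording effect.
pooled = (agree_A + agree_B) / (n_A + n_B)
z = (pA - pB) / math.sqrt(pooled * (1 - pooled) * (1 / n_A + 1 / n_B))
print(f"form A: {pA:.3f}, form B: {pB:.3f}, z = {z:.2f}")
```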
3. Types of Survey Questions

When investigators construct a questionnaire, they face a number of decisions about the form in which their questions are to be asked, though the decisions are often not made on the basis of much reflection. The effects of such decisions were first explored in survey-based experiments conducted in the mid-twentieth century and reported in books by Cantril (1944) and Payne (1951). In 1981 (reprinted 1996), Schuman and Presser provided a systematic review of variations due to question form, along with much new experimental data. Recent books by Sudman et al. (1996), Tanur (1992), Tourangeau et al. (2000), and Krosnick and Fabrigar (forthcoming) consider many of these same issues, as well as a number of additional ones, drawing especially on ideas and research from cognitive psychology (see also Questionnaires: Cognitive Approaches; Sample Surveys: Cognitive Aspects of Survey Design).
An initial important decision is whether to ask a question in open or closed form. Open questions, where respondents answer in their own words and the answers are then coded into categories, are more expensive in terms of both time and money than closed questions, which present two or more alternatives for respondents to choose from. Hence, open questions are not common in present-day surveys, typically being restricted to questions that attempt to capture rapid change and that are easy to code, as in standard inquiries about 'the most important problem facing the country today.' In this case, immediate salience is at issue and responses can usually be summarized in keyword codes such as 'unemployment,' 'terrorism,' or 'race relations.' An open-ended approach is also preferable when numerical answers are wanted, for example, how many hours of television a person watches a week. Schwarz (1996) has shown that offering a specific set of alternatives provides reference points that can shape answers, and thus it is probably better to leave such questions open, as recommended by Bradburn et al. (1979). More generally, open and closed versions of a question often do lead to different univariate response distributions and to different multivariate relations as well (Schuman and Presser 1996 [1981]). Partly this is due to the tendency of survey investigators to write closed questions on the assumption that they themselves know how to frame the main choices, which can lead to their overlooking alternatives or wording especially meaningful to respondents. Many years ago Lazarsfeld (1944) proposed as a practical compromise the use of open questions in the early development of a questionnaire, with the results then drawn on to frame closed alternatives that would be more efficient for use in an actual survey. What has come to be called 'cognitive interviewing' takes this same notion into the laboratory by studying carefully how a small number of individuals think about the questions they are asked and about the answers they give (see several chapters in Schwarz and Sudman 1996). This may not eliminate all open/closed differences, but it helps investigators learn what is most salient and meaningful to respondents. Even after closed questions are developed, it is often instructive to include follow-up 'why' probes of answers in order to gain insight into how respondents perceived the questions and what they see their choices as meaning. Since it is not practical to ask such follow-ups of all respondents about all questions, Schuman (1966) recommended the use of a 'random probe' technique to obtain answers from a subsample of the larger sample of questions and respondents. When the focus is on closed questions, as it often is, a number of further decisions must be made. A frequently used format is to state a series of propositions, to each of which the respondent is asked to indicate agreement or disagreement. Although this is an efficient way to proceed, there is considerable
evidence that a substantial number of people, especially those with less education, show an 'acquiescence bias' when confronted with such statements (Krosnick and Fabrigar forthcoming). The main alternative to the agree/disagree format is to require respondents to make a choice between two or more statements. Such a balanced format encourages respondents to think about the opposing alternatives, though it also requires investigators to reduce each issue to clearly opposing positions. Another decision faced by question writers is how to handle DK (don't know) responses. The proportion of DK answers varies not only by the type of issue—there are likely to be more for a remote foreign policy issue than for a widely discussed issue like the legalization of abortion (Converse 1976–77)—but also by how much DK responses are encouraged or discouraged. At one extreme, the question may offer a DK alternative as one of the explicit choices for respondents to consider, even emphasizing the desirability of its being given if the respondent lacks adequate information on the matter. At the other extreme, interviewers may be instructed to urge those who give DK responses to think further in order to provide a more substantive answer. In between, a DK response may not be mentioned by the interviewer but can be accepted when volunteered. Which approach is chosen depends on one's beliefs about the meaning of a DK response. Those who follow Converse's (1964) emphasis on the lack of knowledge that the majority of people possess about many public issues tend to encourage respondents to consider a DK response as legitimate. Those who argue, like Krosnick and Fabrigar (forthcoming), that giving a DK response is mainly due to 'satisficing' prefer to press respondents to come up with a substantive choice. Still another possibility is that DKs can involve evasion in the case of a sensitive issue (e.g., racial attitudes), and in such cases it is unclear what respondents will do if prevented from giving a DK response. A more general methodological issue is the desirability of measuring attitude strength. Attitudes are typically defined as favorable or unfavorable evaluations of objects, but the evaluations can also be seen as varying in strength. One can strongly favor or oppose the legalization of abortion, for example, but hold an attitude toward gun registration that is weaker, or vice versa. Further, there is more than one way to measure the dimension of strength, as words like 'extremity,' 'importance,' 'certainty,' and 'strength' itself suggest, and thus far it appears that the different methods of measurement are far from perfectly correlated (Petty and Krosnick 1995). Moreover, although there is evidence that several of these strength measures are related to actual behavior, for example, donating money to one side of the dispute about the legalization of abortion, in the case of gun registration the relation has been much weaker,
apparently because other social factors (e.g., the effectiveness of gun lobbying organizations) play a large role independent of attitude strength (Schuman and Presser 1996 [1981]).
4. Question Wording and Bias

Every question in a survey must be conveyed in words, and words are never wholly neutral and unproblematic. Words have tone, connotation, implication—which is why both a referendum and a survey are similar in always coming down to a specific way of describing an issue. Of course, sometimes a question seems to have been deliberately biased, as when a 'survey' mailed out by a conservative organization included the following question:

Do you believe that smut peddlers should be protected by the courts and the Congress, so they can openly sell pornographic materials to your children?
But more typical are two versions of a question that was asked during the Vietnam War:

If a situation like Vietnam were to develop in another part of the world, do you think the United States should or should not send troops [to stop a communist takeover]?
Mueller (1973) found that some 15 percentage points more Americans favored military action when the bracketed words were included than when they were omitted. Yet it is not entirely clear how one should regard the phrase 'to stop a communist takeover' during that period. Was including the phrase 'biasing' responses, or was it simply informing respondents of something they might like to have in mind as they answered? Furthermore, political leaders wishing to encourage or discourage an action can choose how to phrase policy issues, and surveys cannot ignore the force of such framing if they wish to be relevant to important political outcomes. Another instructive example was the attempt to study attitudes during the 1982 war between Argentina and Britain over ownership of a small group of islands in the South Atlantic. It was virtually impossible to phrase a question that did not include the name of the islands, but for the Argentines they were named the Malvinas and for the British the Falkland Islands. Whichever name was used in a question could be seen as prejudicing the issue of ownership. This is an unusual example, but it shows that bias in survey questions is not a simple matter.
5. Conclusion

We return, then, to the fundamental paradox of survey research: the referendum point of view versus the analytic point of view. We all wish at times to know
what the public as a whole feels about an important issue—whether it involves military intervention in a distant country or a domestic issue like government support for health care. We therefore need to remind both ourselves and the public of the limitations of univariate survey results, while at the same time taking whatever steps we can to reduce those limitations. Above all, this means avoiding the tendency to reduce a complex issue to one or two simple closed questions, because every survey question imposes a unique perspective on responses, whether we think of this as 'bias' or not. Moreover, survey data are most meaningful when they involve comparisons, especially comparisons over time and across important social groups—provided that the questions have been kept as constant in wording and meaning as possible. From a practical standpoint, probably the most useful way to see how the kinds of problems discussed here can be addressed is to read significant substantive analyses of survey data, for example, classics like Stouffer (1955) and Campbell et al. (1960), and more recent works that grapple with variability and bias, for example, Page and Shapiro (1992) and Schuman et al. (1997).
Bibliography

Bradburn N M, Sudman S, with the assistance of Blair E, Locander W, Miles C, Singer E, Stocking C 1979 Improving Interview Method and Questionnaire Design. Jossey-Bass, San Francisco
Campbell A, Converse P E, Miller W E, Stokes D E 1960 The American Voter. Wiley, New York
Cantril H 1944 Gauging Public Opinion. Princeton University Press, Princeton, NJ
Converse J M 1976–77 Predicting no opinion in the polls. Public Opinion Quarterly 40: 515–30
Converse J M 1987 Survey Research in the United States: Roots & Emergence 1890–1960. University of California Press, Berkeley, CA
Converse P 1964 The nature of belief systems in mass publics. In: Apter D E (ed.) Ideology and Discontent. Free Press, New York
Deming W E 1968 Sample surveys: the field. In: Sills D (ed.) International Encyclopedia of the Social Sciences. Macmillan, New York, Vol. 13, pp. 594–612
Holleman B H 2000 The Forbid/Allow Asymmetry: On the Cognitive Mechanisms Underlying Wording Effects in Surveys. Rodopi, Amsterdam
Krosnick J A, Fabrigar L A forthcoming Designing Great Questionnaires: Insights from Social and Cognitive Psychology. Oxford University Press, Oxford, UK
Lazarsfeld P F 1944 The controversy over detailed interviews—an offer to negotiate. Public Opinion Quarterly 8: 38–60
Mueller J E 1973 War, Presidents, and Public Opinion. Wiley, New York
Page B I, Shapiro R Y 1992 The Rational Public: Fifty Years of Trends in Americans' Policy Preferences. University of Chicago Press, Chicago
Payne S L 1951 The Art of Asking Questions. Princeton University Press, Princeton, NJ
Petty R E, Krosnick J A (eds.) 1995 Attitude Strength: Antecedents and Consequences. Erlbaum, Mahwah, NJ
Rugg D 1941 Experiments in wording questions: II. Public Opinion Quarterly 5: 91–2
Schuman H 1966 The random probe: a technique for evaluating the validity of closed questions. American Sociological Review 31: 218–22
Schuman H, Presser S 1981 Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context. Academic Press, New York [reprinted 1996, Sage Publications, Thousand Oaks, CA]
Schuman H, Scott J 1987 Problems in the use of survey questions to measure public opinion. Science 236: 957–9
Schuman H, Steeh C, Bobo L, Krysan M 1997 Racial Attitudes in America: Trends and Interpretations. Harvard University Press, Cambridge, MA
Schwarz N 1996 Cognition and Communication: Judgmental Biases, Research Methods, and the Logic of Conversation. Erlbaum, Mahwah, NJ
Schwarz N, Sudman S (eds.) 1996 Answering Questions. Jossey-Bass, San Francisco
Stouffer S A 1955 Communism, Conformity, and Civil Liberties. Doubleday, New York
Sudman S, Bradburn N M, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Tanur J M (ed.) 1992 Questions About Questions: Inquiries Into the Cognitive Bases of Surveys. Russell Sage Foundation, New York
Tourangeau R, Rips L J, Rasinski K A 2000 The Psychology of Survey Response. Cambridge University Press, Cambridge, UK
H. Schuman
Sample Surveys: The Field

1. Definition of Survey Sampling

Survey sampling can be defined as the art of selecting a sample of units from a population of units, creating measurement tools for measuring the units with respect to the survey variables, and drawing precise conclusions about the characteristics of the population or of the process that generated the values of the units. A more specific definition of a survey is the following (Dalenius 1985):
(a) A survey concerns a set of objects comprising a population. One class of population concerns a finite set of objects such as individuals, businesses, and farms. Another concerns events during a specific time period, such as crime rates and sales. A third class concerns processes, such as land use or the occurrence of certain minerals in an area. More specifically, one might want to define a population as, for example, all noninstitutionalized individuals 15–74 years of age living in Sweden on May 1, 2000.
(b) This population has one or more measurable
properties. Examples of such properties are individuals' occupations, businesses' revenues, and the number of elks in an area.
(c) A desire to describe the population by one or more parameters defined in terms of these properties. This calls for observing (a sample of) the population. Examples of parameters are the proportion of unemployed individuals in the population, the total revenue of businesses in a certain industry sector during a given time period, and the average number of elks per square mile.
(d) In order to get observational access to the population, a frame is needed, i.e., an operational representation, such as a list of the population objects or a map of the population. Examples of frames are business and population registers; maps where the land has been divided into areas with strictly defined boundaries; or all n-digit numbers, which can be used to link telephone numbers to individuals. Sometimes the frame has to be developed for the occasion because no registers are available and the elements have to be listed. For general populations this is done by combining multistage sampling with the listing procedure, letting the survey field staff list all elements in sampled areas only; other alternatives would be too costly. For special populations, for example, the population of professional baseball players in the USA, one would have to combine all club rosters into one frame. In some surveys there might exist a number of frames covering the population to varying extents, and for this situation a multiple frame theory has been developed (see Hartley 1974).
(e) A sample of sampling units is selected from the frame in accordance with a sampling design, which specifies a probability mechanism and a sample size. There are numerous sample designs (see Sample Surveys: Methods) developed for different survey situations. The situation may be such that the design chosen solves a problem (using multistage sampling when not all population elements can be listed, or when interviewer and travel costs prevent the use of simple random sampling of elements) or takes advantage of the circumstances (using systematic sampling if the population is approximately ordered, or stratified sampling if the population is skewed). Every sample design specifies selection probabilities and a sample size. It is imperative that the selection probabilities are known, or else the design is nonmeasurable.
(f) Observations are made on the sample in accordance with a measurement design, i.e., a measurement method and a prescription as to its use. This phase is called data collection. There are at least five main modes of data collection: face-to-face interviewing, telephone interviewing, self-administered questionnaires and diaries, administrative records, and direct observation. Each of these modes can be conducted using different levels of technology. Early attempts using the computer took place in the
1970s, in telephone interviewing. The questionnaire was stored in a computer, and a computer program guided the interviewer throughout the interview by automatically presenting questions on the screen and taking care of some interviewer tasks, such as keeping track of skip patterns and personalizing the interview. This technology is called CATI (Computer Assisted Telephone Interviewing). Current levels of technology for the other modes include the use of portable computers for face-to-face interviewing, touch-tone data entry using the telephone keypad, automatic speech recognition, satellite images of land use and crop yields, 'people meters' for TV viewing behaviors, barcode scanning in diary surveys of purchases, electronic exchange of administrative records, and the Internet. Summaries of these developments are provided in Lyberg and Kasprzyk (1991), DeLeeuw and Collins (1997), Couper et al. (1998), and Dillman (2000). Associated with each mode is the survey measurement instrument, or questionnaire. The questionnaire is the result of a conceptualization of research objectives, i.e., a set of properly worded and properly ordered questions. The design of the questionnaire is a science of its own; see, for example, Tanur (1992) and Sudman et al. (1996).
(g) Based on the measurements, an estimation design is applied to compute estimates of the parameters when making inference from the sample to the population. Associated with each sampling design are one or more estimators, functions of the collected data that are used to make statements about the population parameters. Sometimes estimators rely solely on sample data; on other occasions auxiliary information is part of the function. All estimators include sample weights that are used to inflate the sample data. To calculate the error of an estimate, variance estimators are formed, which makes it possible to calculate standard errors and eventually confidence intervals. See Cochran (1977) and Särndal et al. (1992) for comprehensive reviews of sampling theory.
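As a minimal sketch of the estimation step in (g), the fragment below applies the familiar inverse-probability (Horvitz–Thompson) weighting to invented data; the variance estimator shown assumes the simple case of independent selections (Poisson sampling) rather than a general design.

```python
import math

# Invented sample: observed values and their known selection probabilities.
y = [10.0, 14.0, 8.0, 22.0, 17.0]
pi = [0.01, 0.01, 0.02, 0.005, 0.01]

# Horvitz-Thompson estimator of the population total: each sampled value is
# inflated by the weight 1/pi, the number of population units it represents.
weights = [1 / p for p in pi]
t_hat = sum(w * yi for w, yi in zip(weights, y))

# Variance estimator under Poisson sampling (independent inclusion decisions).
var_hat = sum((1 - p) * (yi / p) ** 2 for p, yi in zip(pi, y))
se = math.sqrt(var_hat)

# The standard error yields an approximate 95 percent confidence interval.
print(f"total = {t_hat:.0f}, 95% CI = ({t_hat - 1.96 * se:.0f}, {t_hat + 1.96 * se:.0f})")
```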
2. The Status of Survey Research

There are many types of surveys and survey populations that fit this definition. A large number of surveys are one-time surveys aiming at measuring attitudes or other population characteristics. Some surveys are continuing, thereby allowing the estimation of change over time. An example is a monthly labor force survey. Typically such a survey uses a rotating design in which a sampled person is interviewed a number of times: for example, the person participates four months in a row, is rotated out of the sample for the next four months, and then rotates back in for a final four months. Other surveys aim at comparing different populations regarding a certain characteristic, such as the literacy level in different countries. Business
surveys often study populations in which there are a small number of large businesses and many smaller ones. When the survey goal is to estimate a total, it may be worthwhile to deliberately cut off the smallest businesses from the frame, or to select all large businesses with a probability of one and the smaller ones with other probabilities. Surveys are conducted by many different organizations. There are national statistical offices producing official statistics, university-based organizations conducting surveys as part of their educational activities, and private organizations conducting surveys on anything ranging from official statistics to marketing. The survey industry employs more than 130,000 people in the USA alone, and the world figure is of course much larger. Survey results are very important to society. Governments get continuing information on parameters such as unemployment, national accounts, education, environment, and consumer price indexes. Other sponsors get information on, for example, political party preferences, consumer satisfaction, child day-care needs, time use, and consumer product preferences. As pointed out by Groves (1989), the field of survey sampling has evolved through somewhat independent and uncoordinated contributions from many disciplines, including statistics, sociology, psychology, communication, education, and marketing research. Representatives of these disciplines have varying backgrounds and as a consequence tend to emphasize different design aspects. During the last couple of decades, however, survey research groups have come to collaborate more, as manifested by edited volumes such as Groves et al. (1988), Biemer et al. (1991), Lyberg et al. (1997), and Couper et al. (1998). This development toward teamwork will most likely continue. Many of the error structures resulting from specific sources must be dealt with by multidisciplinary teams, since the errors stem from problems concerning sampling, recall, survey participation, interviewer practices, question comprehension, and conceptualization. (See Sample Surveys: Cognitive Aspects of Survey Design.) The justification for sampling (rather than surveying the entire population, a total enumeration) is not only lower cost but also greater efficiency. Sampling is faster and less expensive than total enumeration. Perhaps more surprisingly, sampling often allows a more precise measurement of each sampled unit than is possible in a total enumeration, which often gives sample surveys quality features superior to those of total enumerations. Sampling as an intuitive tool has probably been used for centuries, but the development of a theory of survey sampling did not start until the late 1800s. The main contributors to this early development, frequently referred to as 'the representative method,' were Kiaer (1897), Bowley (1913, 1926), and Tschuprow (1923). Apart from various inferential aspects, they discussed issues such as stratified
sampling, optimum allocation to strata, multistage sampling, and frame construction. In the 1930s and 1940s most of the basic methods that are used today were developed. Fisher's randomization principle was applied to sample surveys, and Neyman (1934, 1938) introduced the theory of confidence intervals, cluster sampling, ratio estimation, and two-phase sampling. The US Bureau of the Census was perhaps the first national statistical office to embrace and further develop these theoretical ideas. For example, Morris Hansen and William Hurwitz (1943, 1949) and Hansen et al. (1953) helped place the US Labor Force Survey on a full probability-sampling basis, and they also led innovative work on variance estimation and the development of a survey model decomposing the total survey mean squared error into various sampling and bias components. Other important contributions during that era include systematic sampling (Madow and Madow 1944), regression estimation (Cochran 1942), interpenetrating samples (Mahalanobis 1946), and master samples (Dalenius 1957). More recent efforts have concentrated on allocating resources to the control of various sources of error, i.e., methods for total survey design, taking not only sampling but also nonsampling errors into account. A more comprehensive review of historical aspects is provided in Sample Surveys, History of.
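As an illustration of one of these early results, the sketch below computes Neyman's optimum allocation, which distributes a fixed total sample size across strata in proportion to the product of stratum size and stratum standard deviation; the strata and all figures are invented.

```python
# Invented strata for a business survey: sizes N_h and assumed-known
# standard deviations S_h of the study variable within each stratum.
strata = {
    "small firms": {"N": 9000, "S": 5.0},
    "medium firms": {"N": 900, "S": 40.0},
    "large firms": {"N": 100, "S": 300.0},
}
n_total = 300  # overall sample size to be allocated

# Neyman allocation: taking n_h proportional to N_h * S_h minimizes the
# variance of the stratified estimator for a fixed total sample size. (If
# n_h exceeded N_h, the stratum would be taken completely, echoing the
# take-all treatment of large businesses described above.)
products = {h: v["N"] * v["S"] for h, v in strata.items()}
total = sum(products.values())
allocation = {h: round(n_total * p / total) for h, p in products.items()}
print(allocation)  # {'small firms': 122, 'medium firms': 97, 'large firms': 81}
```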
3. The Use of Models

While early developments focused on methods for sample selection in different situations and on proper estimation methods, later developments have to a large extent focused on theoretical foundations and on the use of probability models to increase the efficiency of the estimators. There has been a development from implicit modeling to explicit modeling. The model traditionally used in the early theory is based on the view that what is observed for a unit in the population is basically a fixed value; this may be called the 'fixed population approach.' The stochastic nature of the estimators is a consequence of the deliberately introduced randomization among the population units. A specific feature of survey sampling is the existence of auxiliary information, i.e., known values of a concomitant variable that is in some sense related to the variable under study, so that it can be used to improve the precision of the estimators. The relationship between the variable under study and the auxiliary variables is often expressed as a linear regression model, which can be interpreted as expressing a belief (common, or the sampler's own) concerning the structure of the relationship between the variables. Such modeling is used extensively in early textbooks (see Cochran 1953). A somewhat different approach is to view the values of the variables
as realizations of random variables, using probability models. In combination with the randomization of the units, this constitutes what is called the superpopulation approach. Model-based inference draws conclusions based solely on the properties of the probability models, ignoring the randomization of the units. Design-based inference, on the other hand, ignores the mechanism that generated the data and concentrates on the randomization of the units. In general, model-based inference for estimating population parameters such as subgroup means can be very precise if the model is true but may produce biased estimates if the model is false, while design-based inference leads to unbiased, but possibly inferior, estimates of the population parameters. Model-assisted inference is a compromise that aims at utilizing models in such a way that, if the model is true, the precision is high, but if the model is false, the precision will be no worse than if no model had been used. (See Sample Surveys: Model-based Approaches.) As an example, suppose we want to study a population of families in a country. We want to analyse the structure of disposable income for the households and find out the relation between factors like age, sex, education, and the number of household members on the one hand, and the disposable income of a family on the other. A possible model of the data-generating process is that disposable income is a linear function of these background variables, together with an element of unexplained variation between families having the same values of the background variables. In addition, income will fluctuate from year to year depending on external variation in society. All this suggests that the data-generating process can be represented by a probability model in which disposable income is a linear function of background variables and random errors over time and between families. The superpopulation model is the set of models describing how disposable income is generated for the families. For inferential purposes, a sample of families is selected, and different types of inference can be considered. For instance, we might be interested in giving a picture of the actual distribution of disposable income in the population at the specific time when the sample was selected. Or we might be interested in estimating the coefficients of the relational model, either because we are genuinely interested in the model itself, e.g., for predicting a future total disposable income for the population, which would be of interest to sociologists, economists, and decision makers, or because the model can serve as a tool for creating more efficient estimators of the fixed distribution, given, for example, that the distribution of sex and age is known with reasonable accuracy in the population and can be used as auxiliary information. Evidently, the results will depend on the constellation of families comprising our sample. If we use a sample design that over-represents large households or young households with small
children, compared to the population, inference based on the sample can be misleading. Model-based inference ignores the sample selection procedure and assumes that inference conditional on the sample is a good representation of what would have been the case if all families had been surveyed. Design-based inference ignores the data-generation process and concentrates on the artificial randomization induced by the sampling procedure. Model-assisted inference uses models as tools for creating more precise estimates (a small sketch of the contrast appears at the end of this section). Broadly speaking, model-based inference is mostly used when the relational model is of primary interest; this is the traditional way of analysing sample data as presented in textbooks on statistical theory. Design-based inference, on the other hand, is the traditional way of treating sample data in survey sampling and is mainly focused on giving a picture of the present state of the population. Model-assisted inference uses models as tools for selecting estimators but relies on design properties; it too is mainly focused on picturing the present state of the population. Modern textbooks such as Cassel et al. (1977) and Särndal et al. (1992) discuss the foundations of survey sampling and make extensive use of auxiliary information in the survey design. The different approaches mentioned above have their advocates, but most of the surveys conducted around the world still rely heavily on design-based approaches with implicit modeling. Models are needed, however, to take nonsampling errors into account, since we do not know exactly how such errors are generated. To make measurement errors part of the inference procedure, one has to make assumptions about the error structures. Such error structures concern cognitive issues, question wording and perception, interviewer effects, recall errors, untruthful answers, coding, editing, and so on. Similarly, to make errors of nonobservation (frame coverage and nonresponse errors) part of the inference procedure, one needs to model the mechanisms that generate these errors. The compromise called model-assisted inference takes advantage of both design-based and model-based features. Analysis of data from complex surveys denotes the situation that occurs when the survey statistician is trying to estimate the parameters of a model used to describe a random phenomenon, for example econometric or sociological models such as time series models, regression models, or structural equation models. It is assumed that the data available are sample survey data generated by some sampling mechanism that does not support the assumption of independent identically distributed (IID) observations on a random variable. The traditional inference developed for the estimation of the parameters of the model (and not for estimating the population parameters) presupposes that IID holds, and in some cases traditional inference based on, e.g., maximum likelihood gives misleading results. Comprehensive reviews of the analysis of data from complex surveys are
provided by Skinner et al. (1989) and Lehtonen and Pahkinen (1995). The present state of affairs is that there is a relatively well-developed sampling theory. The theory of nonsampling errors, however, is still in its infancy. A typical scenario is that survey methodologists try to reduce potential errors by using, for example, cognitively tested questionnaires and various means of stimulating survey participation, to the extent that available resources permit. However, not all nonsampling error sources are known, and some that are known defy formal expression. The error reduction strategy can be complemented by sophisticated modeling of error structures. Unfortunately, a rather common implicit model seems to be that nonsampling errors have no serious effect on estimates. In some applications, attempts are made to estimate the total error or error components by evaluation techniques: for a subsample of the units, the survey is replicated using expensive 'gold standard' methods, and the differences between the preferred measurements and the regular ones are used as estimates of the total errors. This is an expensive and time-consuming procedure that is not very suitable for long-range improvement. A more modern and realistic approach is to develop reliable and predictable (stable) survey processes that can be continuously improved (Morganstein and Marker 1997).
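To make the design-based versus model-assisted contrast concrete, the sketch below compares the plain expansion estimator of a population total with a ratio estimator that exploits a known population total of an auxiliary variable. All data are invented; under simple random sampling the ratio estimator gains precision when y is roughly proportional to x, yet it remains approximately design-unbiased even when that model fails.

```python
# Invented simple random sample of n = 4 households from N = 1000;
# y = disposable income, x = an auxiliary variable (e.g., household size)
# whose population total X is assumed known from a register.
N, X = 1000, 2600.0
y = [310.0, 450.0, 260.0, 520.0]
x = [2.0, 3.0, 2.0, 4.0]
n = len(y)

# Design-based expansion estimator of the total of y (no model used).
t_exp = N * sum(y) / n

# Model-assisted ratio estimator: scales the known total X by the sample
# ratio of y to x, exploiting their assumed rough proportionality.
t_ratio = X * sum(y) / sum(x)

print(f"expansion estimate: {t_exp:.0f}, ratio estimate: {t_ratio:.0f}")
```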
4. Conclusions

There are a number of future challenges in the field of survey sampling; we provide just a few examples.
(a) Many surveys are conducted in a primitive way because of limited funding and know-how. The development of more efficient designs taking nonsampling errors into account at the estimation stage is needed. There is also a need for strategies that can help allocate resources to the various design stages so that total error is minimized. Those in charge of surveys sometimes concentrate their efforts on the most visible error sources, or on those for which a tool happens to be available. For instance, most survey sponsors know that nonresponse might be harmful, and the indicator of nonresponse error, the nonresponse rate, is both simple and visible. It may therefore be tempting to put most resources into this error source. On the other hand, not many users are aware of the cognitive phenomena that affect the response delivery mechanism; perhaps, from a total error point of view, more resources should be spent on questionnaire design.
(b) Modern technology permits the simultaneous use of multiple data collection modes within a survey. Multiple modes are used to accommodate respondents, to increase response rates, and to allow inexpensive data collection where possible. There are, however, mode effects, and
there is a need for calibration techniques that can adjust the measurements or the collection instruments so that the mode effect vanishes.
(c) International surveys are becoming increasingly important, and most of the methodological problems mentioned are magnified in that setting. Especially interesting is the concept of cultural bias, which means that concepts and procedures are not uniformly understood, interpreted, and applied across geographical regions or ethnic subpopulations. To define and measure the impact of such bias is an important challenge.

See also: Databases, Core: Demography and Registers; Databases, Core: Political Science and Political Behavior; Databases, Core: Sociology; Microdatabases: Economic; Survey Research: National Centers
Bibliography

Biemer P, Groves R, Lyberg L, Mathiowetz N, Sudman S 1991 Measurement Errors in Surveys. Wiley, New York
Bowley A L 1913 Working-class households in Reading. Journal of the Royal Statistical Society 76: 672–701
Bowley A L 1926 Measurement of the precision attained in sampling. Proceedings of the International Statistical Institute XII: 6–62
Cassel C-M, Särndal C-E, Wretman J 1977 Foundations of Inference in Survey Sampling. Wiley, New York
Cochran W G 1942 Sampling theory when the sampling-units are of unequal sizes. Journal of the American Statistical Association 37: 199–212
Cochran W G 1953 Sampling Techniques, 1st edn. Wiley, New York
Cochran W G 1977 Sampling Techniques, 3rd edn. Wiley, New York
Couper M, Baker R, Bethlehem J, Clark C, Martin J, Nicholls W, O'Reilly J 1998 Computer Assisted Survey Information Collection. Wiley, New York
Dalenius T 1957 Sampling in Sweden. Almqvist and Wiksell, Stockholm, Sweden
Dalenius T 1985 Elements of Survey Sampling. Notes prepared for the Swedish Agency for Research Cooperation with Developing Countries (SAREC)
DeLeeuw E, Collins M 1997 Data collection methods and survey quality: An overview. In: Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) Survey Measurement and Process Quality. Wiley, New York
Dillman D 2000 Mail and Internet Surveys: The Tailored Design Method, 2nd edn. Wiley, New York
Groves R 1989 Survey Errors and Survey Costs. Wiley, New York
Groves R, Biemer P, Lyberg L, Massey J, Waksberg J (eds.) 1988 Telephone Survey Methodology. Wiley, New York
Hansen M H, Hurwitz W N 1943 On the theory of sampling from finite populations. Annals of Mathematical Statistics 14: 333–62
Hansen M H, Hurwitz W N 1949 On the determination of optimum probabilities in sampling. Annals of Mathematical Statistics 20: 426–32
Hansen M H, Hurwitz W N, Madow W G 1953 Sample Survey Methods and Theory (I and II). Wiley, New York
Hartley H O 1974 Multiple frame methodology and selected applications. Sankhya, Series C 36: 99–118
Kiaer A N 1897 The representative method for statistical surveys (original in Norwegian). Kristiania Videnskabsselskabets Skrifter. Historisk-filosofiske klasse 4: 37–56
Lehtonen R, Pahkinen E J 1995 Practical Methods for Design and Analysis of Complex Surveys. Wiley, New York
Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) 1997 Survey Measurement and Process Quality. Wiley, New York
Lyberg L, Kasprzyk D 1991 Data collection methods and measurement error: An overview. In: Biemer P, Groves R, Lyberg L, Mathiowetz N, Sudman S (eds.) Measurement Errors in Surveys. Wiley, New York
Madow W G, Madow L H 1944 On the theory of systematic sampling, I. Annals of Mathematical Statistics 15: 1–24
Mahalanobis P C 1946 On large-scale sample surveys. Philosophical Transactions of the Royal Society of London, Series B 231: 329–451
Morganstein D, Marker D 1997 Continuous quality improvement in statistical agencies. In: Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) Survey Measurement and Process Quality. Wiley, New York
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–625
Neyman J 1938 Contribution to the theory of sampling human populations. Journal of the American Statistical Association 33: 101–16
Särndal C-E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Skinner C, Holt D, Smith T M F (eds.) 1989 Analysis of Complex Surveys. Wiley, New York
Sudman S, Bradburn N M, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco, CA
Tanur J (ed.) 1992 Questions About Questions. Russell Sage, New York
Tschuprow A A 1923 On the mathematical expectation of the moments of frequency distributions in the case of correlated observation. Metron 2: 461–93; 646–80
L. Lyberg and C. M. Cassel
Sanctions in Political Science

A sanction is an action by one actor (A) intended to affect the behavior of another actor (B) by enhancing or reducing the values available to B. Influence attempts by A using actual or threatened punishments of B are instances of negative sanctions. Influence attempts by A using actual or promised rewards to B are instances of positive sanctions. Not all influence attempts involve sanctions: actor A may influence actor B by reason, example, or the provision of information without the use of sanctions.
1. Concepts

Although the definitions of positive and negative sanctions may appear simple, there are both conceptual and empirical difficulties in distinguishing between the two. Some things take the form of positive sanctions but actually are not, e.g., giving a bonus of $100 to an employee who expected a bonus of $500, or promising not to kill a person who never expected to be killed in the first place. Likewise, some things take the form of negative sanctions but actually are not, e.g., a threat to reduce the salary of a person who expected to be fired, or the beating of a masochist. Is withholding a reward ever a punishment? Always a punishment? Is withholding a punishment ever a reward? Always a reward? The answers depend on actor B's perception of the situation. In order to distinguish rewards from punishments, one must establish B's baseline of expectations at the moment A's influence attempt begins (Blau 1986). This baseline is defined in terms of B's expected future value position, i.e., expectations about B's future position relative to the things B values. Positive sanctions, then, are actual or promised improvements in B's value position relative to B's baseline of expectations, and negative sanctions are actual or threatened deprivations relative to the same baseline.
2. Historical Context

Although references to both positive and negative sanctions can be traced to the seventeenth century, the most common usage of the term 'sanctions' refers to negative sanctions. Until the mid-twentieth century, sanctions were viewed primarily as mechanisms for enforcing societal norms, including those embodied in laws. Although this normative view of sanctions continues in legal theory, ethics, political theory, and sociology (Barry 1995, Coleman 1990, Waldron 1994), a broader usage has emerged in political science during the latter part of the twentieth century. This broader usage depicts sanctions as potentially relevant to any type of influence attempt, regardless of whether it is aimed at enforcing social norms or not. This usage is typically found in discussions of influence and power (Baldwin 1971, 1989, Blau 1986, Oppenheim 1981, Lasswell and Kaplan 1950).
3. Kinds of Sanctions

Since sanctions are defined in terms of the values of the target of an influence attempt (actor B), they may take a variety of forms. Lasswell and Kaplan (1950) identified eight exemplary values that could serve as the basis for positive or negative sanctions: power, respect, wealth, rectitude, physical well-being, affection, skill, and enlightenment.
Although it is sometimes suggested that positive and negative sanctions are opposites in the sense that generalizations about one type are equally applicable to the other, mutatis mutandis, this is often not true. Listed below are some of the hypothesized differences between positive and negative sanctions.
3.1 A's Burden of Response

When A's influence attempt is based on a promise, B's compliance obligates A to respond with a reward, whereas B's failure to comply calls for no further response from A.
3.2 Role of Costs

One important consequence of the asymmetry between positive and negative sanctions is that promises tend to cost more when they succeed, while threats tend to cost more when they fail (Boulding 1989, Parsons 1963, Schelling 1960). The difference can be summarized as follows: The bigger the threat, the higher the probability of success; the higher the probability of success, the less the probability of having to implement the threat; the less the probability of having to implement the threat, the cheaper it is to make big threats. The bigger the promise, the higher the probability of success; the higher the probability of success, the higher the probability of having to implement the promise; the higher the probability of having to implement the promise, the more expensive it is to make big promises (ceteris paribus).
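This chain of reasoning can be stated compactly. As a minimal formalization (the symbols are introduced here for illustration and appear in none of the cited works), let p denote the probability that B complies, c_T the cost to A of carrying out the threatened punishment, and c_P the cost to A of delivering the promised reward. The expected implementation costs are then

\[
E[\text{cost of threat}] = (1-p)\,c_{T}, \qquad E[\text{cost of promise}] = p\,c_{P}.
\]

Enlarging either sanction raises p, which shrinks the factor (1 - p) multiplying the threat's implementation cost and enlarges the factor p multiplying the promise's; hence, ceteris paribus, big threats become cheaper to make while big promises become more expensive.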
3.3 Indicators of Success

Whereas a successful threat requires no action by A, a successful promise obligates A to implement that sanction. In a well-integrated social system, where the probability that B will comply with A's wishes is relatively high, promises will be more visible than threats. Indeed, it is precisely because threats are so successful in domestic politics that they are so difficult to detect. Since most citizens obey most laws most of the time, threats to punish lawbreakers have to be carried out with respect to only a small minority of citizens in most polities.
3.4 Efficacy

The question of the relative efficacy of positive versus negative sanctions in exercising influence has been much debated. Parsons (1963) has argued that negative sanctions have more intrinsic effectiveness than positive sanctions in deterrence situations. And ever since Machiavelli and Hobbes, there have been those who argue that force is the ultimate influence technique. Identifying the conditions under which one type of sanction is more effective than another is likely to continue as a focus of research in political science as well as in other social sciences.
3.5 Legitimation

It is usually easier to legitimize demands based on positive sanctions than demands based on negative ones. An example is provided by one of the most important social institutions in the world—the institution of private property. Most societies have one set of rules specifying the conditions under which a person may be deprived of property and quite a different set of rules specifying conditions under which one person may augment another person's property.
4. Foreign Policy Sanctions

Political scientists interested in international relations and foreign policy have devoted a great deal of research to sanctions during the last two decades of the twentieth century. Most of this research has focused on economic sanctions, with comparatively little attention devoted to military, diplomatic, or other noneconomic forms of sanctions. Although progress has occurred with respect to systematic gathering of empirical data (Hufbauer et al. 1990), refinement of concepts (Baldwin 1985), and development of theories, social scientific research on the role of sanctions in international relations is still in its infancy.
4.1 Methodological Problems
Five methodological problems provide useful foci for future research:

(a) Lack of agreement on terms and concepts. While some scholars conceive of economic sanctions in terms of mechanisms for making any type of influence attempt (Baldwin 1985), others define economic sanctions in terms of particular policy goals (Pape 1997). This lack of agreement on basic concepts not only impedes debate within the field of foreign policy studies, it is an obstacle to cross-fertilization between the study of economic sanctions in foreign policy and discussions of other kinds of sanctions in other areas of social science.

(b) Preoccupation with economic sanctions. Research on economic sanctions has been largely insulated from research on other techniques of statecraft. This insulation sometimes gives the impression that economic sanctions are sui generis, rather than treating them as a particular type of statecraft. Although actual or threatened military force is a military sanction, it too is usually viewed as a unique focus of inquiry.
In order to understand when and why policy makers use economic sanctions, it is necessary to understand the alternatives among which policy makers choose. These alternatives include diplomatic, military, and/or verbal sanctions (propaganda). Future research on sanctions might usefully focus on comparative studies of alternative types of sanctions and the conditions under which each type is most likely to be cost-effective.

(c) Neglect of positive sanctions. Research on sanctions is heavily skewed toward negative sanctions. This is true not only of research by international relations scholars, but also of political science in general. During the last two decades, research on positive sanctions has increased (Davis 2000, Newnham 2000), but the emphasis is still disproportionately on negative sanctions.

(d) Lack of agreement on criteria of success. Scholars disagree as to which criteria should be used in measuring the success of economic sanctions. Some consider only the effectiveness of the influence attempt in achieving its goals (Morgan and Schwebach 1997, Pape 1997). Others argue that estimates of success should take into consideration the costs of the sanctions to the user, the costs for noncompliance inflicted on the target, the difficulty of the undertaking, and the comparative utility of alternative policy options (Baldwin 1985, 2000). This lack of agreement on how to conceive of success is not peculiar to the international relations literature. Unlike economics, political science lacks a standardized measure of value in terms of which success can be measured. There is much less agreement among political scientists as to what constitutes political success than there is among economists as to what constitutes economic success (Baldwin 1989). Developing a set of agreed upon criteria of success applicable to noneconomic, as well as economic, sanctions would be a valuable step in sanctions research.

(e) Premature generalization. Research on economic sanctions by international relations scholars frequently leads to premature attempts to specify policy implications. Some scholars, for example, imply that only foolish policy makers would use such policy instruments (Morgan and Schwebach 1997, Tsebelis 1990). In order to make judgments as to the wisdom of using economic sanctions in a given situation, however, one would need to know how effective such sanctions were likely to be, with respect to which goals and targets, at what cost, and in comparison with which policy alternatives (Baldwin 2000). Current research on economic sanctions rarely asks such questions.

See also: Control: Social; Deterrence; Diplomacy; Efficacy: Political; Foreign Policy Analysis; Game Theory; Law: Economics of its Public Enforcement; Legal Culture and Legal Consciousness; Legitimacy: Political; National Security Studies and War Potential
of Nations; Norms; Power: Political; Punishment, Comparative Politics of; Punishment: Social and Legal Aspects; Utilitarianism: Contemporary Applications
Bibliography

Baldwin D A 1971 The power of positive sanctions. World Politics 24: 19–38
Baldwin D A 1985 Economic Statecraft. Princeton University Press, Princeton, NJ
Baldwin D A 1989 Paradoxes of Power. Basil Blackwell, New York
Baldwin D A 2000 The sanctions debate and the logic of choice. International Security 24: 80–107
Barry B 1995 Justice as Impartiality. Clarendon Press, Oxford, UK
Blau P M 1986 Exchange and Power in Social Life. Transaction, New Brunswick, NJ
Boulding K E 1989 Three Faces of Power. Sage, Newbury Park, CA
Coleman J S 1990 Foundations of Social Theory. Belknap, Cambridge, MA
Davis J W 2000 Threats and Promises: The Pursuit of International Influence. The Johns Hopkins University Press, Baltimore, MD
Hufbauer G C, Schott J J, Elliott K A 1990 Economic Sanctions Reconsidered, 2nd edn. Institute for International Economics, Washington, DC
Lasswell H D, Kaplan A 1950 Power and Society: A Framework for Political Inquiry. Yale University Press, New Haven, CT
Morgan T C, Schwebach V L 1997 Fools suffer gladly: The use of economic sanctions in international crises. International Studies Quarterly 41: 27–50
Newnham R E 2000 More flies with honey: Positive economic linkage in German Ostpolitik from Bismarck to Kohl. International Studies Quarterly 44: 73–96
Oppenheim F E 1981 Political Concepts: A Reconstruction. University of Chicago Press, Chicago
Pape R A 1997 Why economic sanctions do not work. International Security 22: 90–136
Parsons T 1963 On the concept of political power. Proceedings of the American Philosophical Society 107: 232–62
Schelling T C 1960 The Strategy of Conflict. Harvard University Press, Cambridge, MA
Tsebelis G 1990 Are sanctions effective? A game-theoretic analysis. Journal of Conflict Resolution 34: 3–28
Waldron J 1994 Kagan on requirements: Mill on sanctions. Ethics 104: 310–24
D. A. Baldwin
Sapir, Edward (1884–1939)

1. General Career

Edward Sapir was born in Lauenburg, Germany (now Lębork, Poland) on January 26, 1884, but his family emigrated to the United States when he was five years old and eventually settled in New York City. His
brilliance earned him a full scholarship to the prestigious Horace Mann School and subsequently a Pulitzer fellowship to Columbia College, where he received his BA in 1904. He continued with graduate work in Germanic philology, but was soon drawn into Franz Boas's orbit and took up an anthropological career, for which he proved extraordinarily well fitted. In 1905 he began a series of field visits to the West that resulted in the detailed documentation of several American Indian languages and cultures, including Wishram Chinook, Takelma, Yana, Southern Paiute, and Nootka. The speed and accuracy with which Sapir collected linguistic and ethnographic data has probably never been surpassed, and his notes, even from his earliest field trips, are among the most valuable manuscripts in American Indian studies. After receiving his doctorate from Columbia in 1909 with a dissertation on the Takelma language, Sapir taught briefly at the University of Pennsylvania, and in 1910 became Chief of the Division of Anthropology in the Geological Survey of Canada. He remained in this position until 1925, making the Nootka (Nuu-chah-nulth) language the focus of his research from 1910 to 1914. Between 1915 and 1920, when World War I and its aftermath brought field work to a halt, he devoted a considerable amount of time to the comparative linguistics of North American languages, establishing the Na-Dene relationship between Athabaskan, Tlingit, and Haida, and proposing expansions of the Penutian and Hokan stocks to include a large number of languages in North and Central America. During this period he also began making a name for himself as a literary and social commentator, and began publishing his experimental poetry. Sapir married Florence Delson in 1911, and they had three children. Shortly after the birth of their third child in 1918, Florence Sapir began to manifest signs of serious illness, partly psychological in character. Although his wife's deteriorating health became a great concern to Sapir, and her death in 1924 was emotionally devastating, he remained committed to the intensive field documentation of American Indian languages, and in 1922 commenced an Athabaskan research program that eventually encompassed full-scale descriptive studies of Sarsi, Kutchin (Gwich'in), Hupa, and Navajo. This work was motivated partly by Sapir's conviction that a historical relationship could be demonstrated between the Na-Dene languages of North America and the Sino-Tibetan languages of East Asia. After the publication of his highly influential book Language (1921) Sapir was widely regarded as one of the leading linguists of his generation. In 1925 he accepted Fay-Cooper Cole's offer of an academic position in the newly reorganized Department of Anthropology and Sociology at the University of Chicago. The move from Ottawa, and his second marriage to Jean McClenaghan, with whom he was to
have two more children, was an emotional and intellectual watershed. He found the interdisciplinary atmosphere at Chicago stimulating and increasingly he addressed general and theoretical topics in his professional writing. He also began contributing sprightly and provocative essays on general social and cultural topics to such publications as Mencken’s American Mercury. While much of his teaching was in linguistics and he took a leading role in establishing that discipline as an autonomous field of study, Sapir also became involved in developing a general model for social science. After he moved to Yale in 1931 as Sterling Professor of Anthropology and Linguistics he became interested particularly in a psychologically realistic paradigm for social research (see Irvine 1999) and led a seminar on personality and culture. Meanwhile linguistic work on Athabaskan, specifically Navajo, continued to absorb him, and the prospect of finding remote connections of the Na-Dene languages in Asia had led him to the study of Tibetan and Chinese. This extraordinarily diverse agenda was abruptly suspended in the summer of 1937 when Sapir suffered a serious heart attack, from which he never fully recovered. After a year and a half of precarious health and restricted activity he died in New Haven, Connecticut on February 4, 1939, a few days after his 55th birthday.
2. Scientific Contribution

Sapir's scientific work can be divided into three distinct parts. First, there is his substantive work in descriptive and comparative linguistics, almost entirely devoted to North American Indian languages. Second, there is his role in establishing the paradigm for twentieth century linguistic research. Finally, there are the flashes of insight—seldom elaborated into formal hypotheses—with which Sapir from time to time illuminated the landscape of linguistic and social theory.
2.1 Substantive Work on Languages

The face that American Indian linguistics presents to the world at the beginning of the twenty-first century probably owes more to Sapir than to any other scholar. When Sapir took up the anthropological study of American Indian languages in 1905, the field was dominated by the classificatory concerns of John Wesley Powell's Bureau of American Ethnology, which saw as its principal task the identification of language and dialect boundaries, and the grouping of languages into families whose historical relationship was undoubted. Only a few earlier scholars like Pickering and Humboldt had been interested in the
general philological study of these languages. It was Franz Boas, Sapir's mentor at Columbia, who first proposed making comparative investigation of the grammatical structures of American Indian languages (and of non-European languages generally) a topic for sustained scientific research. Drawing his model from a German tradition of associating language and social behavior that led back through Steinthal to Humboldt, Boas impressed upon his anthropological students the necessity of understanding how linguistic 'morphology' (by which he meant grammatical structure) channeled ideas into expressive forms. Unfortunately for Boas, few of his students were equipped either by training or by intellectual inclination to carry out linguistic research of a more than superficial kind. The only significant exception was Sapir, who one may imagine was attracted to Boas's anthropology for precisely this reason. From his earliest fieldwork on Wishram Chinook in 1905, Sapir made grammatical analysis the centerpiece of his research, and from his first publications portrayed American Indian languages with a descriptive clarity they had seldom before enjoyed. His accomplishments are legendary to the scholars who today study these languages, and must rank in the first tier of the grammarian's art. The most renowned of his published studies include a full grammar of Takelma, a now-extinct language of southern Oregon (1922); a full grammar and dictionary of Southern Paiute (1930–31/1992); and an outline grammar of Nootka, with texts and vocabulary, prepared with the assistance of Swadesh (1939). Sapir's grammatical descriptions are couched in a metalanguage derived from European comparative philology, tempered by Boas's insistence that the structure of every language is sui generis. In discussions of the evolution of grammatical theory, Sapir's grammars are sometimes portrayed as 'processual' or as early examples of 'generative' descriptions, but such labels imply a theoretical deliberation that was uncharacteristic of Sapir's work. His concern was to explicate the patterns of the language under consideration as lucidly and unambiguously as possible, not to test a general theory of linguistic structure. The aptness with which Sapir's descriptions captured the spirit of the languages he analyzed is no better illustrated than by his model of Athabaskan grammar. Although in this case it was not embodied in a full grammatical treatment, he laid out the basic features of his descriptive system for Athabaskan in a number of shorter works, and passed it on to his students and successors through his teaching and in his files. Sixty years after his death it remains the standard descriptive model for all work in Athabaskan linguistics, regardless of the theoretical stance of the analyst. Throughout his career Sapir maintained a deep interest in historical relationships among languages. He took pride in extending the rigor of the reconstructive method to American Indian language
families, and laid the foundations for the comparative study of both Uto-Aztecan and Athabaskan. In the 1930s he returned to Indo-European linguistics and made major contributions to the Laryngeal Hypothesis (the proposal, originating with Ferdinand de Saussure, that the phonology of Proto-Indo-European included one or more laryngeal or pharyngeal consonants not attested in the extant Indo-European languages).
2.2 The Professionalization of American Linguistics
By 1921, Sapir was able to draw on the analytic details of the American Indian languages on which he had worked to illustrate, in his book Language, the wide variety of grammatical structures represented in human speech. One of the most significant impacts of this highly successful book was to provide a model for the professionalization of academic linguistics in the US during the 1920s and early 1930s. Under Sapir’s guidance, a distinctive American School of linguistics arose, focused on the empirical documentation of language, primarily in field situations. Although most of the students Sapir himself trained at Chicago and Yale worked largely if not exclusively on American Indian languages, the methods that were developed were transferable to other languages. In the mid-1930s Sapir directed a project to analyze English, and during World War II many of Sapir’s former students were recruited to use linguistic methods to develop teaching materials for such strategically important languages as Thai, Burmese, Mandarin, and Russian. A distinction is sometimes drawn between the prewar generation of American linguists—dominated by Sapir and his students and emphasizing holistic descriptions of American Indian languages—and the immediate postwar generation, whose more rigid and focused formal methods were codified by Leonard Bloomfield and others. Since many of the major figures of ‘Bloomfieldian’ linguistics were Sapir’s students, this distinction is somewhat artificial, and a single American Structuralist tradition can be identified extending from the late 1920s through 1960 (Hymes and Fought 1981). There is little doubt that Sapir’s influence on this tradition was decisive.
3. Theoretical Insights

From his earliest work under Boas, Sapir's independent intellectual style often carried him well beyond the bounds of the academic paradigm he was ostensibly working within. He was not, however, a programmatic thinker, and his groundbreaking work, while deeply admired by his students and colleagues, seldom resulted in significant institutional changes, at least in the short term. The cliché 'ahead of his time' is
especially apt in Sapir's case, and the influence of some of his views continues to be felt. This is most striking in structural linguistics, where Sapir must certainly be accounted one of the most influential figures of the twentieth century. As early as 1910 Sapir was commenting on the importance of formal patterning in phonology, and in the 1920s he was among the first to enunciate the 'phonemic principle' which later figured importantly in his teaching at Chicago and Yale. Characteristically, he left it to his students and such colleagues as Leonard Bloomfield to formalize an analytic methodology, while he himself pursued the psychological implications of formal patterning. Sapir's (1917) trenchant critique of the culture concept, particularly as defined by A. L. Kroeber, was largely ignored at the time. Sapir argued that the attribution of cultural patterning to an emergent 'superorganic' collective consciousness was an intellectual dead-end, and that research would be better directed at understanding the individual psychology of collective patterned behavior. While Kroeber's view was undoubtedly the dominant one in anthropology and general social science for much of the twentieth century, Sapir's analysis is much more consistent with recent models of human sociocultural behavior that have been developed by evolutionary psychologists. The fact that Sapir's views came from someone with extraordinary insight into the elaborate self-referential patterns of language—traditionally, the most formalized and objectified of social behaviors—can hardly be accidental, and again is consistent with recent developments in cognitive science. In the late 1920s Sapir found an important intellectual ally in Harry Stack Sullivan, a psychiatrist whose interpersonal theory of the genesis of schizophrenia resonated with Sapir's views (Perry 1982). The two became close friends, and together with Harold Lasswell they planned and organized the William Alanson White Psychiatric Foundation, a research and teaching institution that ultimately was located in Washington, DC. In the year before he died Sapir gave serious consideration to leaving Yale and working with Sullivan in a research position at the Foundation. Sapir's views on the history of language were linked to his view of the abstraction of patterns, and were equally controversial or misunderstood. A distinction must be drawn between Sapir's work, noted earlier, as a historical linguist within a family of languages whose relationship was secure (Athabaskan, Uto-Aztecan, Indo-European), and his explorations of much less certain relationships (Hokan, Penutian, Na-Dene) and possible interhemispheric connections (Sino-Dene). In the former, Sapir worked—with characteristic creativity and insight—with tools and models derived from a long tradition of comparative linguistics. In the latter, it was (or seemed to his contemporaries) often a matter of brilliant intuition. In fact, in this work he
usually relied on an assessment of similarities in structural pattern that distinguished features susceptible to unconscious change from generation to generation (e.g., regular inflectional patterns, words) from those that are largely inaccessible to individual cognition. Although he never offered a theoretical explication, his idea of what constituted ‘deep’ linguistic patterns was exemplified in his classification of North and Central American Indian languages (1929). This classification, which remains influential, is still untested in its own terms.
4. Impact and Current Importance

A charismatic teacher, Sapir had a succession of highly motivated students both at Chicago and at Yale, the most prominent among them Morris Swadesh (who collaborated with him on Nootka research), Harry Hoijer (who codified Sapir's analysis of Athabaskan grammar), Mary R. Haas, Stanley Newman, C. F. Voegelin, George L. Trager, Zellig Harris, David G. Mandelbaum, and Benjamin L. Whorf. Through these students Sapir exercised a considerable posthumous influence on intellectual and institutional developments in both linguistics and in anthropology through the 1960s. A postwar collection of Sapir's most important general papers, edited by Mandelbaum (1949), was widely read and is still consulted. Harris' (1951) extended review of this book provides a comprehensive summary of Sapir's oeuvre as it was understood by his immediate circle. Sapir is cited most frequently today for the 'Sapir–Whorf Hypothesis' of linguistic relativity, a name that inaccurately implies an intellectual collaboration between Sapir and his Yale student, Benjamin Whorf, who himself died in 1941. The more correctly designated 'Whorf theory complex' (Lee 1996) was a retrospective construct that derived largely from writings of Whorf's that were unpublished at the time of Sapir's death. Although undoubtedly stimulated by Sapir's writing and teaching, Whorf's proposal that the structure of a language to some extent determines the cognitive and behavioral habits of its speakers cannot be connected directly with Sapir's mature thought on the psychology of language and culture. Sapir's most enduring achievement is his own descriptive linguistic work. Long after the writings of most of his contemporaries have been forgotten, Sapir's grammatical studies continue to be held in the highest esteem. In recent years his holistic analytic technique has been emulated by a number of linguists seeking an alternative to narrow formalism, particularly when working with American Indian or other indigenous languages. Sapir's life and work was the subject of a 1984 conference (Cowan et al. 1986), from which emerged a plan to publish a standard edition of all of Sapir's work, including edited versions of unfinished manuscripts;
by the year 2000 seven volumes had appeared. A biography by Darnell (1990) is useful for the externals of Sapir's career, but her reluctance to give an intellectual account of Sapir's work, particularly in linguistics, leaves some important issues still to be addressed (Silverstein 1991).

See also: Archaeology and the History of Languages; Historical Linguistics: Overview; Language and Ethnicity; Language and Thought: The Modern Whorfian Hypothesis; Linguistic Anthropology; Linguistic Fieldwork; Linguistic Typology; Linguistics: Comparative Method; North America and Native Americans: Sociocultural Aspects; North America, Archaeology of; North America: Sociocultural Aspects; Phonology; Population Composition by Race and Ethnicity: North America; Psycholinguistics: Overview; Sapir–Whorf Hypothesis; Sociolinguistics
Bibliography

Cowan W, Foster M K, Koerner K (eds.) 1986 New Perspectives in Language, Culture and Personality. Benjamins, Amsterdam and Philadelphia
Darnell R 1990 Edward Sapir: Linguist, Anthropologist, Humanist. University of California Press, Berkeley and Los Angeles, CA
Harris Z S 1951 Review of D G Mandelbaum (ed.) Selected Writings of Edward Sapir in Language, Culture, and Personality. Language 27: 288–333
Hymes D, Fought J 1981 American Structuralism. Mouton, The Hague, The Netherlands
Irvine J T (ed.) 1999 The psychology of culture: A course of lectures by Edward Sapir, 1927–1937. In: The Collected Works of Edward Sapir, Vol. 3, Culture. Mouton de Gruyter, Berlin and New York, pp. 385–686 [also published as a separate volume, The Psychology of Culture, Mouton de Gruyter 1993]
Lee P 1996 The Whorf Theory Complex: A Critical Reconstruction. Benjamins, Amsterdam and Philadelphia
Mandelbaum D G (ed.) 1949 Selected Writings of Edward Sapir in Language, Culture, and Personality. University of California Press, Berkeley and Los Angeles, CA
Perry H S 1982 Psychiatrist of America: The Life of Harry Stack Sullivan. Belknap Press, Cambridge, MA and London
Sapir E 1917 Do we need a 'superorganic'? American Anthropologist 19: 441–47
Sapir E 1921 Language. Harcourt Brace, New York
Sapir E 1922 The Takelma language of Southwestern Oregon. In: Boas F (ed.) Handbook of American Indian Languages, Part 2. Bureau of American Ethnology, Washington, DC
Sapir E 1929 Central and North American Indian languages. Encyclopaedia Britannica, 14th edn. Vol. 5, pp. 138–41
Sapir E 1930–31/1992 The Southern Paiute Language. The Collected Works of Edward Sapir, Vol. 10. Mouton de Gruyter, Berlin and New York
Sapir E, Swadesh M 1939 Nootka Texts: Tales and Ethnological Narratives with Grammatical Notes and Lexical Materials. Linguistic Society of America, Philadelphia
Silverstein M 1991 Problems of Sapir historiography. Historiographia Linguistica 18: 181–204
V. Golla
Sapir–Whorf Hypothesis

1. Nature and Scope of the Hypothesis

The Sapir–Whorf hypothesis, also known as the linguistic relativity hypothesis, refers to the proposal that the particular language one speaks influences the way one thinks about reality. Although proposals concerning linguistic relativity have long been debated, American linguists Edward Sapir (1884–1939) and Benjamin Lee Whorf (1897–1941) advanced particularly influential formulations during the second quarter of the twentieth century, and the topic has since become associated with their names. The linguistic relativity hypothesis focuses on structural differences among natural languages such as Hopi, Chinese, and English, and asks whether the classifications of reality implicit in such structures affect our thinking about reality more generally. Analytically, linguistic relativity as an issue stands between two others: a semiotic-level concern with how speaking any natural language whatsoever might influence the general potential for human thinking (i.e., the general role of natural language in the evolution or development of human intellectual functioning), and a functional- or discourse-level concern with how using any given language code in a particular way might influence thinking (i.e., the impact of special discursive practices such as schooling and literacy on formal thought). Although analytically distinct, the three issues are intimately related in both theory and practice. For example, claims about linguistic relativity depend on understanding the general psychological mechanisms linking language to thinking, and on understanding the diverse uses of speech in discourse to accomplish acts of descriptive reference. Hence, the relation of particular linguistic structures to patterns of thinking forms only one part of the broader array of questions about the significance of language for thought. Proposals of linguistic relativity necessarily develop two linked claims among the key terms of the hypothesis (i.e., language, thought, and reality). First, languages differ significantly in their interpretations of experienced reality—both what they select for representation and how they arrange it. Second, language interpretations have influences on thought about reality more generally—whether at the individual or cultural level. Claims for linguistic relativity thus require both articulating the contrasting interpretations of reality latent in the structures of different languages, and assessing their broader influences on, or relationships to, the cognitive interpretation of reality. Simple demonstrations of linguistic diversity are sometimes mistakenly regarded as sufficient in themselves to prove linguistic relativity, but they cannot in themselves show that the language differences affect thought more generally. (Much confusion arises in this
regard because of the practice in linguistics of describing the meaningful significance of individual elements in a language as 'relative to' the grammatical system as a whole. But this latter relativity of the meaning of linguistic elements to the encompassing linguistic structure should be distinguished from broader claims for a relativity of thought more generally to the form of the speaker's language.) A variety of other arguments to the effect that distinctive perceptual or cognitive skills are required to produce and comprehend different languages likewise usually fail to establish any general effects on thought (see Niemeier and Dirven 2000). Linguistic relativity proposals are sometimes characterized as equivalent to linguistic determinism, that is, the view that all thought is strictly determined by language. Such characterizations of the language–thought linkage bear little resemblance to the proposals of Sapir or Whorf, who spoke in more general terms about language influencing habitual patterns of thought, especially at the conceptual level. Indeed, no serious scholar working on the linguistic relativity problem as such has subscribed to a strict determinism. (There are, of course, some who simply equate language and thought, but under this assumption of identity, the question of influence or determinism is no longer relevant.) Between the patent linguistic diversity that nearly everyone agrees exists and a claim of linguistic determinism that no one actually espouses lies the proposal of linguistic relativity, that is, the proposal that our thought may in some way be taken as relative to the language spoken.
2. Historical Development of the Hypothesis

Interest in the intellectual significance of the diversity of language categories has deep roots in the European tradition (Aarsleff 1988, Werlen 1989, Koerner 1992). Formulations related to contemporary ones appear during the Enlightenment period in the UK (Locke), France (Condillac, Diderot), and Germany (Hamann, Herder). They are stimulated variously by opposition to the universal grammarians, by concerns about the reliability of language-based knowledge, and by practical efforts to consolidate national identities and cope with colonial expansion. Most of this work construes the differences among languages in terms of a hierarchical scheme of adequacy with respect to reality, to reason, or to both. Later, nineteenth-century work in Germany by Humboldt and in France/Switzerland by Saussure drew heavily on this earlier tradition and set the stage for the approaches of Sapir and Whorf. Humboldt's arguments, in particular, are often regarded as anticipating the Sapir–Whorf approach. He argued for a linguistic relativity according to the formal processes used by a language (e.g., inflection, agglutination, etc.). Ultimately this remains a hierarchical relativity in which certain language types (i.e.,
European inflectional ones) are viewed as more adequate vehicles of thought and civilization—a view distinctly at odds with what is to follow. Working within the US anthropological tradition of Franz Boas and stimulated by the diversity and complexity of Native American languages, Edward Sapir (1949) and Benjamin Lee Whorf (1956) reinvigorated and reoriented investigation of linguistic relativity in several ways (Lucy 1992a, Lee 1997). First, they advocated intensive first-hand scientific investigation of exotic languages; second, they focused on structures of meaning, rather than on formal grammatical process such as inflection; and third, they approached these languages within a framework of egalitarian regard. Although not always well understood, theirs is the tradition of linguistic relativity most widely known and debated today. Whorf's writings, in particular, form the canonical starting point for all subsequent discussion. Whorf proposed a specific mechanism for how language influences thought, sought empirical evidence for language effects, and articulated the reflexive implications of linguistic relativity for scholarly thought itself. In his view, each language refers to an infinite variety of experiences with a finite array of formal categories (both lexical and grammatical) by grouping experiences together as analogically 'the same' for the purposes of speech. The categories in a language also interrelate in a coherent way, reinforcing and complementing one another, so as to constitute an overall interpretation of experience. These linguistic classifications vary considerably across languages not only in the basic distinctions they recognize but also in the assemblage of these categories into a coherent system of reference. Thus the system of categories which each language provides to its speakers is not a common, universal system, but a particular 'fashion of speaking.' Whorf argued that these linguistic structures influence habitual thought by serving as a guide to the interpretation of experience. Speakers tend to assume that the categories and distinctions of their language are entirely natural and given by external reality, and thus can be used as a guide to it. When speakers attempt to interpret an experience in terms of a category available in their language, they unwittingly involve other language-specific meanings implicit in that particular category and in the overall configuration of categories in which it is embedded. In Whorf's view language does not blind speakers to some obvious reality, but rather it suggests associations which are not necessarily entailed by experience. Because language is such a pervasive and transparent aspect of behavior, speakers do not understand that the associations they 'see' are from language, but rather assume that they are 'in' the external situation and patently obvious to all. In the absence of another language (natural or artificial) with which to talk about experience, speakers will not be able to recognize
the conventional nature of their linguistically based understandings. Whorf argues that by influencing everyday habitual thought in this way, language can come to influence cultural institutions generally, including philosophical and scientific activity. In his empirical research Whorf showed that the Hopi and English languages treat 'time' differently, and that this difference corresponds to distinct cultural orientations toward temporal notions. Specifically, Whorf argued that speakers of English treat cyclic experiences of various sorts (e.g., the passage of a day or a year) in the same grammatical frame used for ordinary object nouns. Thus, English speakers are led to treat these cycles as object-like in that they can be measured and counted just like tangible objects. English also treats objects as if they each have a form and a substance. Since the cyclic words get put into this object frame, English speakers are led to ask what is the substance associated with the forms a day, a year, and so forth. Whorf argues that our global, abstract notion of 'time' as a continuous, homogeneous, formless something can be seen to arise to fill in the blank in this linguistic analogy. The Hopi, by contrast, do not treat these cycles as objects but as recurrent events. Thus, although they have, as Whorf acknowledged, words for what English speakers would recognize as temporal cycles (e.g., days, years, etc.), the formal analogical structuration of these terms in their grammar does not give rise to the abstract notion of 'time' that English speakers have. (Ironically, critics of Whorf's Hopi data often miss his point about structural analogy and focus narrowly on individual lexical items.) Finally, grouping referents and concepts as formally 'the same' for the purposes of speech has led speakers to group those referents and concepts as substantively 'the same' for action generally, as evidenced by related cultural patterns of belief and behavior he describes.
3. Empirical Research on the Hypothesis

Although the Sapir–Whorf proposal has had wide impact on thinking in the humanities and social sciences, it has not been extensively investigated empirically. Indeed, some believe it is too difficult, if not impossible in principle, to investigate. Further, a good deal of the empirical work that was first developed was quite narrowly confined to attacking Whorf's analyses, documenting particular cases of language diversity, or exploring the implications in domains such as color terms that represent somewhat marginal aspects of language structure. In large part, therefore, acceptance or rejection of the proposal for many years depended more on personal and professional predilections than on solid evidence. Nonetheless, a variety of modern initiatives have stimulated renewed interest in mounting empirical assessments of the hypothesis.
Contemporary empirical efforts can be classed into three broad types, depending on which of the three key terms in the hypothesis they take as their point of departure: language, reality, or thought (Lucy 1997). A structure-centered approach begins with an observed difference between languages, elaborates the interpretations of reality implicit in them, and then seeks evidence for their influence on thought. The approach remains open to unexpected interpretations of reality but often has difficulty establishing a neutral basis for comparison. The classic example of a structure-centered approach is Whorf's pioneering comparison of Hopi and English described above. The most extensive contemporary effort to extend and improve the comparative fundamentals in a structure-centered approach has sought to establish a relation between variations in grammatical number marking and attentiveness to number and shape (Lucy 1992b). This research remedies some of the traditional difficulties of structure-centered approaches by framing the linguistic analysis typologically so as to enhance comparison, and by supplementing ethnographic observation with a rigorous assessment of individual thought. This then makes possible the realization of the benefits of the structure-centered approach: placing the languages at issue on an equal footing, exploring semantically significant lexical and grammatical patterns, and developing connections to interrelated semantic patterns in each language. A domain-centered approach begins with a domain of experienced reality, typically characterized independently of language(s), and asks how various languages select from, encode, and organize it. Typically, speakers of different languages are asked to refer to 'the same' materials or situations so the different linguistic construals become clear. The approach facilitates controlled comparison, but often at the expense of regimenting the linguistic data rather narrowly. The classic example of this approach, developed by Roger Brown and Eric Lenneberg in the 1950s, showed that some colors are more lexically encodable than others, and that more codable colors are remembered better. This line of research was later extended by Brent Berlin, Paul Kay, and their colleagues, who argued instead that there are crosslinguistic universals in the encoding of the color domain such that a small number of 'basic' color terms emerge in languages as a function of biological constraints. Although this research has been widely accepted as evidence against the validity of the linguistic relativity hypothesis, it actually deals largely with constraints on linguistic diversity rather than with relativity as such. Subsequent research has challenged Berlin and Kay's universal semantic claim, and shown that different color-term systems do in fact influence color categorization and memory. (For discussions and references, see Lucy 1992a, Hardin and Maffi 1997, Roberson et al. 2000.) The most successful effort to improve the quality of the linguistic comparison in
a domain-centered approach has sought to show cognitive differences in the spatial domain between languages favoring the use of body coordinates to describe arrangements of objects (e.g., 'the man is left of the tree') and those favoring systems anchored in cardinal direction terms or topographic features (e.g., 'the man is east/uphill of the tree') (Pederson et al. 1998, Levinson in press). This research on space remedies some of the traditional difficulties of domain-centered approaches by developing a more rigorous and substantive linguistic analysis to complement the ready comparisons facilitated by this approach. A behavior-centered approach begins with a marked difference in behavior which the researcher comes to believe has its roots in and provides evidence for a pattern of thought arising from language practices. The behavior at issue typically has clear practical consequences (either for theory or for native speakers), but since the research does not begin with an intent to address the linguistic relativity question, the theoretical and empirical analyses of language and reality are often weakly developed. The most famous example of a behavior-centered approach is the effort to account for differences in Chinese and English speakers' facility with counterfactual or hypothetical reasoning by reference to the marking of counterfactuals in the two languages (Bloom 1981). The interpretation of these results remains controversial (Lucy 1992a).
4. Future Prospects

Two research trends are unfolding at the present time. First, in the cognitive and psychological sciences awareness is increasing of the nature and scope of language differences. This has led to a greater number of studies focused on the possible cognitive consequences of such differences (e.g., Levinson in press, Niemeier and Dirven 2000). Second, an increasing integration is emerging among the three levels of the language and thought problem (i.e., the semiotic, structural, and functional levels). On the semiotic side, for example, research on the relationship between language and thought in development is increasingly informing and informed by work on linguistic relativity (Bowerman and Levinson 2001). On the functional side, research on the relationship of cultural and discursive patterns of use is increasingly being brought into dialogue with Whorfian issues (Silverstein 1979, Friedrich 1986, Wierzbicka 1992, Hill and Mannheim 1992, Gumperz and Levinson 1996). The continued relevance of the linguistic relativity issue seems assured by the same impulses found historically: the patent relevance of language to human sociality and intellect, the reflexive concern with the role of language in scholarly practice, and the practical encounter with linguistic diversity. To this we must
add the increasing concern with the unknown implications for human thought of the impending loss of many if not most of the world’s languages (Fishman 1982). See also: Cognitive Psychology: History; Human Cognition, Evolution of; Language and Philosophy; Language and Thought: The Modern Whorfian Hypothesis; Linguistic Anthropology; Linguistics: Overview; Sapir, Edward (1884–1939); Semiotics; Wittgenstein, Ludwig (1889–1951)
Bibliography

Aarsleff H 1988 Introduction. In: Von Humboldt W On Language: The Diversity of Human Language-structure and its Influence on the Mental Development of Mankind (Heath P, trans.). Cambridge University Press, Cambridge, UK, pp. vii–xv
Bloom A H 1981 The Linguistic Shaping of Thought. Lawrence Erlbaum, Hillsdale, NJ
Bowerman M, Levinson S C 2001 Language Acquisition and Conceptual Development. Cambridge University Press, Cambridge, UK
Fishman J 1982 Whorfianism of the third kind: Ethnolinguistic diversity as a worldwide societal asset (The Whorfian Hypothesis: Varieties of validation, confirmation, and disconfirmation II). Language in Society 11: 1–14
Friedrich P 1986 The Language Parallax: Linguistic Relativism and Poetic Indeterminacy. University of Texas, Austin, TX
Gumperz J J, Levinson S C (eds.) 1996 Rethinking Linguistic Relativity. Cambridge University Press, Cambridge, UK
Hardin C L, Maffi L (eds.) 1997 Color Categories in Thought and Language. Cambridge University Press, Cambridge, UK
Hill J H, Mannheim B 1992 Language and world view. Annual Review of Anthropology 21: 381–406
Koerner E F K 1992 The Sapir–Whorf hypothesis: A preliminary history and a bibliographic essay. Journal of Linguistic Anthropology 2: 173–8
Lee P 1997 The Whorf Theory Complex: A Critical Reconstruction. John Benjamins, Amsterdam, The Netherlands
Levinson S C in press Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge University Press, Cambridge, UK
Lucy J A 1992a Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. Cambridge University Press, Cambridge, UK
Lucy J A 1992b Grammatical Categories and Cognition: A Case Study of the Linguistic Relativity Hypothesis. Cambridge University Press, Cambridge, UK
Lucy J A 1997 Linguistic relativity. Annual Review of Anthropology 26: 291–312
Niemeier S, Dirven R (eds.) 2000 Evidence for Linguistic Relativity. John Benjamins, Amsterdam, The Netherlands
Pederson E, Danziger E, Wilkins D, Levinson S, Kita S, Senft G 1998 Semantic typology and spatial conceptualization. Language 74: 508–56
Roberson D, Davies I, Davidoff J 2000 Color categories are not universal: Replications and new evidence from a stone-age culture. Journal of Experimental Psychology—General 129: 369–98
Sapir E 1949 The Selected Writings of Edward Sapir in Language, Culture, and Personality (Mandelbaum D G, ed.). University of California Press, Berkeley, CA
Silverstein M 1979 Language structure and linguistic ideology. In: Clyne P, Hanks W, Hofbauer C (eds.) The Elements: A Parasession on Linguistic Units and Levels. Chicago Linguistic Society, Chicago, pp. 193–247
Werlen I 1989 Sprache, Mensch und Welt: Geschichte und Bedeutung des Prinzips der sprachlichen Relativität. Wissenschaftliche Buchgesellschaft, Darmstadt, Germany
Whorf B L 1956 Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf (Carroll J B, ed.). MIT Press, Cambridge, MA
Wierzbicka A 1992 Semantics, Culture, and Cognition: Universal Human Concepts in Culture-specific Configurations. Oxford University Press, Oxford, UK
J. A. Lucy
Sauer, Carl Ortwin (1889–1975)

Carl Sauer was one of the towering intellectual figures of the twentieth century, not only in geography but also in a wider intellectual sphere. However, because of his wide-ranging thought, speculative sweep, and world perspective over a number of subjects, it is difficult to define his contribution to the social and behavioral sciences precisely and easily. But one can say that some of his underlying concerns were about scholarship, independence of thought, opposition to academic bureaucracy, a sympathy and identification with rural folk, concern for cultural diversity and environmental quality, and a distaste for the technological and scientific 'fix,' particularly the solutions offered by the emerging social sciences after 1945. He was born in Warrenton, Missouri on December 24, 1889 of German parents, and because of his background and three years of schooling in Calw (near Stuttgart) he was influenced by German culture and literature. He completed his graduate studies at the University of Chicago in 1915 under the geographer Ellen Churchill Semple, the geologist Rollin D. Salisbury, and the plant ecologist Henry C. Cowles, the latter two of whom made a lasting intellectual impression on him. He then taught at the University of Michigan until 1923, when he moved to geography at Berkeley where he taught for 34 years (32 as Chair) and established one of the most distinctive graduate schools of American Geography that would always be associated with 'cultural geography.' After he retired in 1955 he enjoyed 20 remarkably productive years that saw the publication of four books and a score of influential papers, all distinguished by big and speculative ideas that were the fruits of a lifetime's reflection and unhurried reading. His reputation soared so that an 'aura of sage, philosopher-king, and even oracle surrounded him' (Hooson 1981, p. 166). He died in Berkeley on July 18, 1975.
1. Cultural Geography

Sauer rebelled against the sterile environmental determinism of contemporary geography with its emphasis on humans as response mechanisms to physical factors. If nothing else, his experience in the Economic Land Survey in the Michigan Cutovers had shown him that humans radically transformed the earth, often for the worse, and in the process created cultural landscapes. In his search for a new, humane geography, cultural anthropology seemed to offer a means of dealing with the diversity of humankind and its cultural landscapes through time. On arriving in Berkeley he found natural soul-mates in the anthropologists Alfred L. Kroeber and Robert H. Lowie. The concept of 'culture' subsequently pervaded all his teaching and writing. In The Morphology of Landscape he distilled an almost wholly German geographical literature, established the primacy of human agency in the formation of cultural landscapes 'fashioned out of the natural landscape by a cultural group,' and the importance of a time-based approach. In addition, he placed great importance on observation and contemplation in the field—a Verstehen or empathetic understanding and intuitive insight into behavior or object in order to achieve 'a quality of reasoning at a higher plane' than the tangible facts (see Sauer 1925, Williams 1983). Sauer wrote 'Morphology' in order to 'emancipate' himself from determinist thinking and, with a few more papers on the cultural landscape of the midwest frontier, he put the epistemological game behind him and started substantive field work on early settlement and society in Lower California, Arizona, and Mexico. During the 1930s he produced many works, including Aztatlan and The Road to Cibola, both published in a new monograph series, Ibero-Americana, that he founded in the same year with Alfred Kroeber and the historian H. E. Bolton. (For these and many other of Sauer's publications see Leighly 1963.) These investigations drew him into the controversy about New World plant domestication and plant origins, and into collaboration with botanists, archeologists, and ethnologists, whom he found congenial intellectual company.
2. Widening Horizons

All the time Sauer's horizons were getting wider, his ideas more speculative, and his ethical values more refined. Toward the end of the 1930s he wrote a slight, but ultimately influential, paper which was a sustained and biting critique of the destructive social and environmental impact that resulted from the predatory outreach of Europe, which had few counterparts at that time except, perhaps, in the writing of Karl Marx (Sauer 1938). He drew inspiration from the work of George Perkins Marsh on the human transformation of the earth and Ernst Friedrich's concept of Raubwirtschaft, or destructive exploitation (see Marsh [1865] 1965, Friedrich 1904).
His experience and knowledge of Latin and Central America and their history suggested to him that the Spanish conquest had led to a devastating and permanent impoverishment of the land and of its cultures and societies. Disease, warfare, and enslavement had disrupted traditional value systems. Thus, the diffusion of technologically superior societies could affect humans and their culture just as much as it could physical resources. But two works more than any others established his world reputation and heralded a remarkable decade of multifaceted yet interrelated speculative understanding of the place of humans on earth. First was Agricultural Origins and Dispersals (Sauer 1952a), which flowered later into a string of publications on the human uses of the organic world, and on early humans in the Americas from the Ice Age onward. Unfortunately, radiocarbon dating came too late to inform Sauer's writing, but although he may not have provided the answers, he defined the questions brilliantly. Second, in 1956, with the collaboration of Marston Bates and Lewis Mumford, he masterminded the Princeton symposium on 'Man's Role in Changing the Face of the Earth,' the theme of which thereafter became his overriding interest. (See Mumford, Lewis (1895–1990).) All his learning and concerns culminated in this volume, and in his chapter 'The Agency of Man on Earth' (Sauer 1956). The capacity of humans to alter the natural environment—the 'deformation of the pristine'—the cult of progress and waste that stemmed from mass production ('commodity fetishism'), and the alien intrusion of humans into world ecology, were included. In contemporary terms, the theme was the degradation of the environment, and it was an early and influential statement. It also had another dimension: globally, the 'imperialism of production' was as bad as the old, colonial imperialism, and might ultimately be no better than Marxist totalitarianism; mass culture was eliminating not only biological diversity but also cultural diversity, and older and less robust societies. Somehow, humans had to rise above this mindless, short-term exploitative mode. 'The high moments of history have come not when man was concerned with the comforts and displays of the flesh but when his spirit was moved to grow in grace.' Therefore, what was needed was 'an ethic and aesthetic under which man, practising the qualities of prudence and moderation, may indeed pass on to posterity a good Earth' (Sauer 1956, p. 68). His simply articulated ideas had a resonance with many activists and intellectuals, as well as Californian avant-garde poets and literati, who extolled his work as an example of cultural and ecological sensitivity and respect, tinged with deep historical insight and scholarship, and made attractive by his simple and pithy language. He also tapped a deep spring of feeling
during the 1960s and 1970s, at the time of Vietnam and student unrest, with their concerns about the limits of the earth and of technological/political power. Yet Sauer was a complex mix. He was congenitally nonconformist but deeply conservative, and although profoundly concerned with conservation he was never formally an 'environmentalist'; indeed, he thought the movement little more than an 'ecological binge.'
3. Distrust of the Social Sciences

In many ways it is ironic that Sauer has a place in this Encyclopedia, because he had such a deep distrust of, and distaste for, the behavioral and social sciences. Although he eschewed methodological and epistemological discussion, his writing, and particularly his personal correspondence, reveals that he was 'a philosopher in spite of himself' (Entrikin 1984). Culture history became his model of social science, and it was drawn from the natural sciences, not the social sciences. He was basically a pragmatist, influenced by the writings of the German cultural geographers Friedrich Ratzel and Eduard Hahn, and by the methods of geology and anthropology, with their emphasis on the provisional character of working hypotheses, which were no more than a means to an end. His later association in Berkeley with people who worked on 'tangible things,' such as plant ecologists, agricultural scientists, botanists (e.g., Ernest Babcock), experts in the evolution of population genetics (see Wright, Sewall (1889–1988)), and geneticists, reinforced this. He argued for a theoretical and methodological pluralism that stemmed naturally from the inherent diversity of nature and culture. The natural science idea of a dynamic balance and diversity arising from organic evolution was embedded deeply in his thought, and he felt that the modern world had disrupted these processes so that it was out of balance. Hence his deep distrust of US capitalism and of all bureaucratic systems that would destroy diversity and local community. Liberal social scientists who designed, planned, and directed community life were more likely to destroy than to enhance it, by imposing universalizing concepts of social organization and by ignoring the inherent pluralism and diversity of nature and culture that stimulated the naïve curiosity about the world which was the essence of geography. Theoretical and normative social science as practiced by economists, sociologists, and political scientists grated on him, with its exaggerated confidence in the statistical and the inductive, and its 'dialectic atmosphere.' Its focus on the present precluded a better insight into the origins and evolution of any topic, and gave 'an exaggerated accent on contemporaneity.' His social science was based firmly in history and geography: this was culture history.
During the late 1930s he talked jokingly of the two patron saints of social scientists: St. Bureaucraticus, who represented rationalism and professionalism in US academic life, and St. Scholasticus, who represented social theorists who sought normative generalizations about humankind and society. Consistently, these 'Sons of Daedalus' were the target of his criticism, because their well-funded procedures and programs emasculated the independence of impressionable younger scholars (Williams 1983). His address 'Folkways of the Social Sciences' was an eloquent, if not audacious, plea to social scientists to 'give back the search for truth and beauty to the individual scholar to grow in grace as best he can' and to reinstate time and place into the study of the USA (Sauer 1952b). Sauer's other bête noire, mass production, promoted the homogenization of society, not only in the USA but globally, as the USA firmly assumed the position of superpower during the 1950s, with what he saw as a disdain for the 'lesser breeds.' Sauer was aware that he was 'out of step' with his colleagues and intellectuals, and spoke of himself as an 'unofficial outsider' and even as a 'peasant.' This has been typified as part of the antimodernism that characterized early twentieth-century US intellectual life. But it went much deeper than that; his ideas anticipated society's fears of, and disenchantment with, progress, science, the elimination of diversity, and the degradation of the environment. Perhaps the ultimate relevance of Sauer's work was in his groping towards a rapprochement between the social sciences, the humanities, and the biological life sciences. By being behind he was far ahead.

See also: Agricultural Sciences and Technology; Environmental Determinism; Geography; Place in Geography
Bibliography

Entrikin J N 1984 Carl O. Sauer: Philosopher in spite of himself. Geographical Review 74: 387–408
Friedrich E 1904 Wesen und geographische Verbreitung der 'Raubwirtschaft.' Petermanns Mitteilungen 50: 68–79, 92–5
Hooson D 1981 Carl Ortwin Sauer. In: Blouet B W (ed.) The Origins of Academic Geography in the United States. Archon, Hamden, CT, pp. 165–74
Leighly J (ed.) 1963 Land and Life: A Selection from the Writings of Carl Ortwin Sauer. University of California Press, Berkeley, CA
Marsh G P 1965 Man and Nature: Physical Geography as Modified by Human Action, Lowenthal D (ed.). Harvard University Press, Cambridge, MA
Sauer C O 1925 The Morphology of Landscape. University of California Publications in Geography, Berkeley, CA, Vol. 2, pp. 19–53
Sauer C O 1938 Destructive exploitation in modern colonial expansion. Comptes Rendus du Congrès International de Géographie, Amsterdam 2 (Sect. 3c): 494–9
Sauer C O 1952a Agricultural Origins and Dispersals. Bowman Memorial Lectures, Series 2. American Geographical Society, New York
Sauer C O 1952b Folkways of social science. In: The Social Sciences at Mid-Century: Papers Delivered at the Dedication of Ford Hall, April 19–21, 1951. University of Minnesota Press, Minneapolis, MN
Sauer C O 1956 The agency of man on earth. In: Thomas W L (ed.) Man's Role in Changing the Face of the Earth. Chicago University Press, Chicago, IL, pp. 49–69
Williams M 1983 The apple of my eye: Carl Sauer and historical geography. Journal of Historical Geography 9: 1–28
Williams M 1987 Carl Sauer and man's role in changing the face of the earth. Geographical Review 77: 218–31
M. Williams
Saussure, Ferdinand de (1857–1913)

1. Saussure's Status in Twentieth-century Linguistics

Saussure is best known for the posthumous compilation of lecture notes on general linguistics taken down assiduously by students attending his courses during 1907–1911, the Cours de linguistique générale, edited by his former students and junior colleagues and first published in 1916 (and since 1928 translated into more than a dozen languages). During his lifetime, Saussure was most widely known for his masterly Mémoire of 1878, devoted to an audacious reconstruction of the Proto-Indo-European vowel system. However, it is generally agreed that his Cours ushered in a revolution in linguistic thinking during the 1920s and 1930s which, at the beginning of the twenty-first century, is still felt in many quarters, even beyond linguistics proper. He is widely regarded as 'the father of structuralism'; to many his work produced a veritable 'Copernican revolution' (Holdcroft 1991, p. 134). Indeed, essential ingredients and terms of his theory have become points of reference for any serious discussion about the nature of language, its functioning, development, and uses.
2. Formative Years and Career

Saussure was born on November 26, 1857 in Geneva, Switzerland. Although from a distinguished Geneva family which, beginning with Horace Bénédict de Saussure (1740–1799), can boast of several generations of natural scientists, F. de Saussure was drawn early to language study, producing an 'Essai pour réduire les mots du grec, du latin et de l'allemand à un petit nombre de racines' at age 14 or 15 (published in Cahiers Ferdinand de Saussure 32: 77–101 [1978]). Following his parents' wishes, Saussure attended classes in chemistry, physics, and mathematics at the
University of Geneva during 1875–1876, before being allowed to join his slightly older classmates, who had left for Leipzig the year before. So in the fall of 1876 Saussure arrived at the university where a number of important works in the field of Indo-European phonology and morphology, including Karl Verner's (1846–1896) epoch-making paper on the last remaining series of exceptions to 'Grimm's Law,' had just been published. Saussure took courses with Georg Curtius (1820–1885), the mentor of the Junggrammatiker, and with a number of the younger professors, such as August Leskien (1840–1916), Ernst Windisch (1844–1918), Heinrich Hübschmann (1848–1908), Hermann Osthoff (1847–1909), and others in the fields of Indic studies, Slavic, Baltic, Celtic, and Germanic. During 1878–1879 Saussure spent two semesters at the University of Berlin, enrolling in courses in Indic philology with Heinrich Zimmer (1851–1910) and Hermann Oldenberg (1854–1920). After barely six semesters of formal study of comparative-historical Indo-European linguistics Saussure, then just 21, published his major lifetime work. In this 300-page Mémoire sur le système primitif des voyelles dans les langues indo-européennes (1879) Saussure assumed, on purely theoretical grounds, the existence of an early Proto-Indo-European sound of unknown phonetic value (designated *A) which would develop into various phonemes of the Indo-European vocalic system depending on its combination with the 'sonantal coefficients.' Saussure was thus able to explain a number of puzzling questions of Indo-European ablaut. However, the real proof of Saussure's hypotheses came only many years later, after his death, following the decipherment of Hittite and its identification as an Indo-European language. In 1927 the Polish scholar Jerzy Kuryłowicz (1895–1978) pointed to Hittite cognates, i.e., related words corresponding to forms found in other Indo-European languages, that contained a laryngeal (not present in any of the other attested Indo-European languages) corresponding to Saussure's 'phonème' *A (Szemerényi 1973). What is significant in Saussure's approach is his insistence on, and rigorous use of, the idea that the original Proto-Indo-European vowels form a coherent system of interrelated terms. Indeed, it is this emphasis on the systematic character of language which informs all of Saussure's linguistic thinking, to the extent that there are not, contrary to received opinion, two Saussures, the author of the Mémoire and the originator of the theories laid down in the Cours (cf. Koerner 1998). Having returned to Leipzig, Saussure defended his dissertation on the use of the genitive absolute in Sanskrit in February 1880, leaving for Geneva soon thereafter. Before he arrived in Paris in September 1880, he appears to have conducted fieldwork on Lithuanian, an Indo-European language whose documents reach back only to the sixteenth century, but which exhibits a rather conservative vowel system
comparable with that of Ancient Greek. First-hand exposure to this language was instrumental in his explanation of the Lithuanian system of accentuation (Saussure 1896), for which he is justly famous. In 1881, Michel Bréal (1832–1915), the doyen of French linguistics, secured him a position as Maître de Conférences at the École des Hautes Études, a post he held until his departure for Geneva 10 years later. In Paris, Saussure found a number of receptive students, among them Antoine Meillet (1866–1936), Maurice Grammont (1866–1946), and Paul Passy (1859–1940), but also congenial colleagues such as Gaston Paris (1839–1903), Louis Havet (1849–1925), who had previously written the most detailed review of his Mémoire, and Arsène Darmesteter (1848–1888). Still, Saussure did not write any major work subsequent to his doctoral dissertation, though he did write a series of papers, frequently etymological, which illustrate his acumen in historical linguistics. It was through the posthumous publication of his lectures on (in fact historical and) general linguistics that Saussure became known for his theoretical and nonhistorical views. In 1891, the University of Geneva offered Saussure a professorship of Sanskrit and Comparative Grammar, which was made into a regular chair of Comparative Philology in 1896. It was only late in 1906 that the Faculty added the subject of General Linguistics to his teaching load. It was this decision and Saussure's three ensuing lecture series (1907, 1908–9, and 1910–11), in which he developed his thoughts about the nature of language and the manner in which it was to be studied, that eventually led to the epoch-making book he did not write, the Cours de linguistique générale. Saussure died on February 22, 1913 at Château Vufflens, Switzerland.
3. The Cours de linguistique générale

The Cours appeared in 1916. By the 1920s Saussure's name began to be almost exclusively connected with this posthumous work, which was based largely on extensive lecture notes carefully taken down by a number of his students. One of them was Albert Riedlinger (1883–1978), whose name appears on the title page of the Cours as a collaborator. It was, however, put together by Saussure's successors in Geneva, Charles Bally (1865–1947) and Albert Sechehaye (1870–1946), neither of whom had attended these lectures themselves, though the contrary is frequently stated in the literature. Indeed, their own focus of attention was nonhistorical linguistics (stylistics and syntax, respectively), and this had a considerable bearing on the manner in which Saussure's ideas were presented (Amacker 2000), with Historical Linguistics, the subject Saussure was most interested in, being relegated to the end of the book. (See Godel 1957 for an analysis of the editors' work; also Strozier 1988 for a close analysis of the texts.) It
was the long general introduction of the Cours and the part dealing with nonhistorical ('synchronic') linguistics which made history.
3.1 Saussure’s\the Cours’s Legacy The ideas advanced in the Cours produced something of a revolution in linguistic science; historicalcomparative grammar which had dominated linguistic research since the early nineteenth century soon became a mere province of the field. At least in the manner the Cours had been presented by the editors, Saussure’s general theory of language was seen as assigning pride of place to the nonhistorical, descriptive, and ‘structural’ approach. (Saussure himself did not use the last-mentioned term in a technical sense.) This emphasis on the investigation of the current state of a language or languages led to a tremendous body of work concerned with the analysis of the linguistic system or systems of language and its function(s), and a concomitant neglect of questions of language change and the field of Historical Linguistics in general, a situation still very much characteristic of the current linguistic scene. However, the field has become stronger since the mid-1980s, as sociolinguistic and typological aspects took hold in the investigation of language change. From the 1920s onwards, notably outside of the traditional centers of Indo-European comparative linguistics, a variety of important schools of linguistic thought developed in Europe that can be traced back to proposals made in the Cours. These are usually identified with the respective centers from which they emanated, such as Geneva, Prague, Copenhagen, even London; more precisely these developments are to be associated with the names of Bally and Sechehaye, Roman Jakobson (1896–1982) and Nikolaj S. Trubezkoy (1890–1938), Louis Hjelmslev (1899– 1965), and John Rupert Firth (1890–1960), respectively. In North America too, through the work of Leonard Bloomfield (1887–1949), Saussure’s ideas became stock-in-trade among linguists, descriptivists, structuralists, and generativists (cf. Joseph 1990, for Saussure’s influence on Bloomfield as well as Chomsky). In each ‘school,’ it is safe to say, essential ingredients of the Cours were interpreted differently, at times in opposition to some of Saussure’s tenets as found in the book, which Saussure specialists now refer to as the ‘vulgata’ text, given that a number of points made in the Cours go back to its editors, not Saussure himself. However, it is this text that has made the impact on modern linguistics.
3.2 The Main Tenets of the Cours

At the core of Saussure's linguistic theory is the assumption that language is a system of interrelated
terms, which he called 'langue' (in contradistinction to 'parole,' the individual speech act or speaking in general). This supra-individual 'langue' is the underlying code ensuring that people can speak to and understand each other; the language system thus has a social underpinning. At the same time, 'langue' is an operative system embedded in the brain of everyone who has learned a given language. The analysis of this system and its functioning, Saussure maintains, is the central object of linguistics. His characterization of 'langue' as a 'fait social' has often led to the belief that Saussure's thinking was indebted to Émile Durkheim's (1858–1917) sociological framework. While the two were close contemporaries and shared much of the same intellectual climate, no direct influence of the latter on the former can be demonstrated; Meillet, Saussure's former student and a collaborator of Durkheim's since the late 1890s, publicly denied it when it was first proposed in 1931. For Saussure the social bond between speakers sharing the same language ('langue') was constitutive for the operation of this unique semiological system (see below). The language system is a network of relationships which Saussure characterized as being of two kinds: 'syntagmatic' (i.e., items are arranged in a consecutive, linear order) and 'associative,' later termed 'paradigmatic' by Firth and by Hjelmslev (i.e., pertaining to the organization of units in a deeper, not directly observable fashion dealing with grammatical and semantic relations). Since it is only in a state ('état de langue') that this system can be revealed, the nonhistorical, 'synchronic' approach to language must take pride of place. Only after two such language states of different periods in the development of a given language have been properly described can the effects of language change be calculated, i.e., can 'diachronic,' historical linguistics be conducted. Hence the methodological, if not epistemological, primacy of synchrony over diachrony. Apart from syntagmatic vs. paradigmatic relations, several trichotomies can be found in the Cours which, however, are usually reduced to dichotomies. Many of them have become current in twentieth-century thought, far beyond their original application, i.e., langage–langue–parole (language in all its manifestations, language as the underlying system, and 'speaking,' with terms such as 'tongue' and 'discourse' or 'competence' and 'performance' having been proposed to replace the langue/parole couple), signe–signifié–signifiant (sign, signified, and signifier), and synchrony vs. diachrony (Saussure's 'panchrony' would be an overarching of these two perspectives). Saussure's definition of language as 'a system of (arbitrary) signs' and his proposal of linguistics as the central part of an overall science of sign relations, or 'sémiologie,' have led to the development of a field of inquiry more frequently called (following Charles Sanders Peirce's [1839–1914] terminology) 'semiotics,' which more often than not deals with sign systems
other than those pertaining to language, such as literary texts, visual art, music, and architecture. (For the wider implications of Saussure's socio-semiotic ideas and a critique of the various uses of Saussure's concepts in literary theory, see Thibault 1997.) As is common with influential works (cf., e.g., Freud's), many ingredients of Saussure's general theory of language have often been taken out of their original context and incorporated into theories outside their intended application, usually selectively and quite arbitrarily, especially in works by French writers engaged in 'structural' anthropology (e.g., Claude Lévi-Strauss), Marxist philosophy (e.g., Louis Althusser), literary theory (e.g., Jacques Derrida), psychoanalysis (e.g., Jacques Lacan), and semiotics (e.g., Roland Barthes), and by their various associates and followers. (For a judicious critique of these extralinguistic exploitations, see Tallis 1988.) However, these uses, and abuses, demonstrate the endurance and originality of Saussure's ideas. He has achieved in linguistics a status comparable to that of Immanuel Kant in philosophy, in that we can, similar to Kant's place in the history of thought, distinguish between a linguistics before Saussure and a linguistics after Saussure.
Bibliography

Amacker R 2000 Le développement des idées saussuriennes chez Charles Bally et Albert Sechehaye. Historiographia Linguistica 27: 205–64
Bouquet S 1997 Introduction à la lecture de Saussure. Payot, Paris
Engler R 1976 Bibliographie saussurienne [1970–]. Cahiers Ferdinand de Saussure 30: 99–138, 31: 279–306, 33: 79–145, 40: 131–200, 43: 149–275, 50: 247–95 (1976, 1977, 1979, 1986, 1989, 1997)
Godel R 1957 Les Sources manuscrites du Cours de linguistique générale de F. de Saussure. Droz, Geneva, Switzerland
Harris R 1987 Reading Saussure: A Critical Commentary on the Cours de linguistique générale. Duckworth, London
Holdcroft D 1991 Saussure: Signs, System, and Arbitrariness. Cambridge University Press, Cambridge, UK
Joseph J E 1990 Ideologizing Saussure: Bloomfield's and Chomsky's readings of the Cours de linguistique générale. In: Joseph J E, Taylor T E (eds.) Ideologies of Language. Routledge, London and New York, pp. 51–93
Koerner E F K 1972 Bibliographia Saussureana, 1870–1970: An Annotated, Classified Bibliography on the Background, Development and Actual Relevance of Ferdinand de Saussure's General Theory of Language. Scarecrow Press, Metuchen, NJ
Koerner E F K 1973 Ferdinand de Saussure: Origin and Development of his Linguistic Thought in Western Studies of Language. A Contribution to the History and Theory of Linguistics. F. Vieweg and Sohn, Braunschweig, Germany
Koerner E F K 1988 Saussurean Studies/Études saussuriennes. Avant-propos de R. Engler. Slatkine, Geneva, Switzerland
Koerner E F K 1998 Noch einmal on the history of the concept of language as a 'système où tout se tient'. Cahiers Ferdinand de Saussure 51: 21–40
Saussure F de 1879 Mémoire sur le système primitif des voyelles dans les langues indo-européennes. B. G. Teubner, Leipzig, Germany (Repr. G. Olms, Hildesheim, 1968). Also in: Lehmann W P (ed.) A Reader in Nineteenth-Century Historical Indo-European Linguistics. Indiana University Press, Bloomington and London, 1967, pp. 218–24
Saussure F de 1896 Accentuation lituanienne. Indogermanische Forschungen, Anzeiger 6: 157–66
Saussure F de 1916 Cours de linguistique générale, ed. by Charles Bally and Albert Sechehaye, with the collaboration of Albert Riedlinger. Payot, Lausanne and Paris (2nd edn., Paris: Payot, 1922; 3rd and last corrected edn., 1931; 4th edn., 1949; 5th edn., 1960, etc.). English transl.: (1) by Wade Baskin, Course in General Linguistics, Philosophical Library, London and New York, 1959 (repr. New York: McGraw-Hill, 1966; rev. edn., Collins/Fontana, London, 1974); (2) by Roy Harris, Course in General Linguistics, Duckworth, London, 1983
Saussure F de 1922 Recueil des publications scientifiques, ed. by Charles Bally and Léopold Gautier. Payot, Lausanne; C. Winter, Heidelberg (Repr. Slatkine, Geneva, 1970; it includes a reprint of the Mémoire, his 1880 dissertation (pp. 269–338), and all of Saussure's papers published during his lifetime)
Saussure F de 1957 Cours de linguistique générale (1908–1909): Introduction. Cahiers Ferdinand de Saussure 15: 6–103
Saussure F de 1967–1968, 1974 Cours de linguistique générale. Édition critique par Rudolf Engler, 4 fasc. Otto Harrassowitz, Wiesbaden, Germany
Saussure F de 1972 Cours de linguistique générale. Payot, Paris
Saussure F de 1978 (1872) Essai pour réduire les mots du grec, du latin et de l'allemand à un petit nombre de racines. Cahiers Ferdinand de Saussure 32: 77–101
Strozier R M 1988 Saussure, Derrida, and the Metaphysics of Subjectivity. Mouton de Gruyter, Berlin and New York
Szemerényi O 1973 La théorie des laryngales de Saussure à Kuryłowicz et à Benveniste: Essai de réévaluation. Bulletin de la Société de Linguistique de Paris 68: 1–25
Tallis R 1988 Not Saussure. Macmillan, London
Thibault P J 1997 Re-Reading Saussure: The Dynamics of Signs in Social Life. Routledge, London
E. F. K. Koerner
Savage, Leonard J (1917–71)

L. J. Savage, always addressed by friends as Jimmie, was born in Detroit, Michigan on 20 November 1917. His father was in the real-estate business, his mother a nurse. Throughout his life he suffered from poor eyesight, a combination of nystagmus and extreme myopia, which was to affect the development of his career in many ways. Most of us accept reading as an easy, even casual, activity, whereas to him it was a serious business, with the text held close to the eye. The result was that he appeared to regard the material as important and would absorb it with an intensity that often escapes those with normal vision. To have him read something you had written was, at first, an uncomfortable business, for he would politely question much of the material; only later would you realize that he was being constructive, and the final result benefited
enormously from his study of it. He was a superb lecturer, once remarking that he could hold himself spell-bound for an hour. When, towards the end of his life, he gave the Fisher lecture, he held the audience spell-bound for considerably more than the allotted hour. The poor eyesight at first affected him adversely, his high-school teacher not recommending him for further education. His parents were insistent, however, and he went initially to Wayne University and then to the University of Michigan. His attempts to do biology failed because he could not draw, and chemistry was ruled out when he dropped the beaker in the laboratory. Eventually he met a fine mathematics teacher who introduced him to a field in which the reading had to be intense yet minimal, where his brilliance was recognized, and he gained a Ph.D. in the application of vectorial methods to metric geometry. After a year each at Princeton, Cornell, and Brown, 1944 found him a member of the Statistical Research Group at Columbia University, a group he described as 'one of the greatest hotbeds statistics has ever had,' a judgement that is surely sound, as almost all those who were to become the leading statisticians in the US of the 1950s were in the group or associated with it. So he became a statistician, and although his greatest work was in theory, as a statistician he was able to indulge his interest in science, so that throughout his career he enjoyed, and was extremely good at, consulting with his scientific colleagues. It was not just science that interested him, for he had inherited from his father a respect for business, and had acquired from Milton Friedman, a member of the group, an appreciation of economics that made his advice useful and generously available to all serious thinkers. In 1946, Savage went to the University of Chicago where, from 1949, he was in the Statistics department founded by Allen Wallis, becoming chairman in 1957 and remaining so until he left the university in 1960. His departure was a turning point in his life. A reason for leaving was personal difficulties in his marriage to Jane Kretschmer, which they hoped could be mended by a move to Michigan with their two sons, Sam and Frank. It did not work out, and they were divorced in 1964. That year he married Jean Pearce and moved to Yale. It was in New Haven on 1 November 1971 that he tragically died, at the early age of 53, and statistics lost a great scholar and many of us a dear friend. The move from Chicago was partly influenced by his perception of his relations with colleagues there. He felt, as we will see below, that he had produced a sound axiomatization of statistics showing that many standard statistical procedures were unsatisfactory, so that his colleagues should either explain where the axioms were wrong, or else abandon the faulty procedures. They did neither. They were not alone in this; most statisticians acted as the Chicago faculty did, refusing to contest the axioms yet refusing to use the results (many still do so today), and
Savage found it difficult to get a suitable post until Yale obliged. The monograph by Savage and others (1962), although now rather dated, indicates the sort of antagonism that existed, though in that book all the protagonists are on their best behavior. Savage (1981) contains most of his more important papers, tributes from others, and a technical summary of his work. Many of the papers contained therein deal not with technicalities but with his growing appreciation of statistical ideas as they developed throughout his life. Savage is justifiably famous for one tremendous contribution to statistics and the scientific method, contained in his 1954 book The Foundations of Statistics, or, to be more correct, in the first seven chapters of that book. Anyone contemplating study of this glorious achievement should first read his preface to the second edition of 1972, for this is one of the most honest pieces of scientific writing that I know. To appreciate the importance of the book, it is necessary to cast one's mind back to the state of statistics around 1950. Fisher had made enormous advances from 1920 onwards, Neyman and Pearson had introduced new ideas, and Wald was developing decision analysis. But the statistical procedures available, though undoubtedly valuable and much appreciated by scientists, especially biologists, were a disparate collection of ideas lacking coherence. Savage had the idea that statistics should be like other branches of mathematics and be based on a set of concepts, termed axioms, from which the procedures would follow as theorems. He wanted to provide a firm basis for Fisher's results. Another way of understanding this is to recognize that Fisher had suggested procedures whose properties had been studied by him and others, whereas Savage turned the method around: he asked what properties we wanted and discovered what procedures provided them. His expectation was that they would prove to be just what statisticians were using. To his great surprise, they were not. Interestingly, his last paper, Savage et al. (1976), is a brilliant, sympathetic review of Fisher's work. Savage had worked with von Neumann in Princeton and had much appreciated his work on games with Morgenstern (1944), which introduced axioms leading to the concept of a utility function for the outcomes of a game. However, their treatment had used probability without justification, so Savage conceived the idea of developing an axiomatic system for decision-making under uncertainty which would not use probability in the axioms but derive it in theorems, so uniting the ideas of probability and utility. Also influential was the attempt by Wald (1950) to unite statistical ideas under the concept of decision-making, using von Neumann's idea of minimax, but lacking any axiomatic foundation. Savage's 1954 book presents axioms for a single decision-maker, now often referred to as 'you,' faced with selecting a course of action in the face of uncertainty. He then proves three
theorems: first, that your various uncertainties must combine according to the rules of probability; second, that your perceptions of the merits of the possible outcomes that result from your actions must be described by a real-valued utility function (which is itself based on probability); and third, that your optimum decision is the one that maximizes your expected utility (MEU), the expectation being evaluated according to the probabilities developed in the first part of this trilogy. One of the axioms he used encapsulated a simple notion that has since been seen to be of some importance: if you prefer A to B when C is true and, at the same time, prefer A to B when C is false, then you prefer A to B when you do not know whether or not C is true. It is called the 'sure-thing' principle because the preference of A over B is sure, in American parlance, at least as far as C is concerned.
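In computational terms the trilogy is easy to state. The following sketch is purely illustrative: the actions, states, probabilities, and utilities are invented for this article, not taken from Savage.

```python
# Maximizing expected utility (MEU) for a single decision-maker.
# All numbers are hypothetical; 'you' supply the personal probabilities
# (first theorem) and the utilities (second theorem).

personal_prob = {"rain": 0.3, "dry": 0.7}

utility = {
    ("take umbrella", "rain"): 0.8, ("take umbrella", "dry"): 0.6,
    ("leave umbrella", "rain"): 0.0, ("leave umbrella", "dry"): 1.0,
}

def expected_utility(action):
    # Weight each outcome's utility by your probability for that state.
    return sum(p * utility[(action, state)] for state, p in personal_prob.items())

# Third theorem: the optimal act is the one with maximal expected utility.
actions = {action for action, _ in utility}
best = max(actions, key=expected_utility)
print(best, expected_utility(best))  # leave umbrella 0.70 (vs. 0.66 for taking it)
```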
The introduction of MEU was not new, it having been used by Daniel Bernoulli in the eighteenth century, but the development extended its use considerably, in particular by showing that it was the only sensible method for a single decision-maker to use. Utility had recently been explained, as we have seen, by von Neumann. The original and dramatically important result was the first, saying that probability is the only sensible description for uncertainty, so that the rules of probability are not arbitrary but dictated by the sensible and modest requirements expressed in the axioms. Others, from Laplace onwards, had used probability without justification: Savage showed that it was inevitable. So it was the extensive and unique use of probability that was Savage's main contribution in the book, and there are two ways in which his analysis was influential. First was the fact that the probability was personal, expressing the uncertainty of the decision-maker, 'you.' It was not the probability, but rather your probability. An immediate reaction is that this contradicts the very nature of the scientific method, which is supposed to yield objective results. This apparent contradiction was resolved in a later paper with Edwards and Lindman (Savage et al. 1963), which showed, in the principle of stable estimation, that under general conditions people with different uncertainties will reach agreement on receipt of enough data. This result agrees with the observation that scientists typically disagree with one another in the early stages of an investigation but are eventually brought together as the evidence accumulates. Many now think that the reduction of uncertainty through data, incorporated in the laws of probability, expresses the nature of induction and is part of the scientific method. A second influential factor was the result that probability was the only sensible description of uncertainty, for scientists, encouraged by statisticians, had been using other methods, for example tail-area significance tests of a null hypothesis, H, where the relevant quantity is the probability in the tail of a distribution, given that H is true. Savage's argument was that the proper evaluation is the probability of H, given the data, not of an aspect of the data, given H. Similar contradictions arise with confidence intervals for a parameter, which involve probability statements about the interval, not about the parameter. These are instances where his conclusions clashed with standard practice, leading to disagreements like that at Chicago mentioned above. There was a third way in which Savage's ideas contradicted current practice, though this was not fully realized until Birnbaum's (1962) paper. The axiomatic development showed that if you have a statistical model with probability p(x|θ) for data x given parameter θ, then the only way the data can affect your uncertainty about θ, given x, is through the likelihood function p(x|θ), expressing how that probability varies with θ for the fixed, observed x. This is the likelihood principle. Significance tests clearly violate the principle, since they use a tail area which involves integration of p(x|θ) over x-values, that is, over data that you do not have. Indeed, many statistical procedures, even today, violate the principle. A related example of conflict with current practice is found in optional stopping. It had long been known that, under commonly occurring circumstances, it is possible to continue sampling until a null hypothesis is rejected at any preassigned significance level, so that you can opt to stop at your convenience. Savage showed that it was the significance test that was at fault, not the stopping procedure. It is an astonishing fact, which he explains in the preface to the second edition, that he and others who worked with him in the mid-1950s, including myself, did not appreciate these ideas and did not comprehend that, rather than justifying statistical practice, his work altered it. The latter part of his book, in which the justification is attempted, largely using the tool of minimax that von Neumann had employed with great effect in the theory of games, is today of little interest, in contrast to the brilliance of the early chapters. Savage was a true scholar, a man who studied the work of others, accepted any valid criticism of his own work, and incorporated their ideas with due acknowledgment. And so he understood the ideas of Ramsey (1926), who had, in a less rigorous presentation, developed similar ideas, ideas which no one had understood until Savage. But more importantly, he appreciated the work of the Italian, de Finetti, who, from an entirely different standpoint, and one which is perhaps more forceful than Savage's, developed the concept and uniqueness of personal probability. (For English readers, the best references are de Finetti (1972, 1974, 1975), though his ideas originate in the 1930s.) Savage acquired skill in Italian and worked with him. One of the problems de Finetti had studied was your assessment of probability. You are uncertain whether it will rain tomorrow in your city; so, according to their thesis, you have a probability for rain.
How are you to assess its value? De Finetti suggested a scoring rule: if you choose p as your value, you will receive a penalty score of (1 − p)² if it rains and of p² if it does not. Scores for different events are added. To minimize your expected total score, the values that you give must be your probabilities. The basic paper was written by Savage (1971), with due acknowledgment to the originator.
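The honesty property of the quadratic rule follows from a short calculation. Writing q for your actual degree of belief that it will rain (a symbol introduced here purely for exposition), the expected penalty from announcing the value p is

\[
S(p) = q\,(1-p)^2 + (1-q)\,p^2,
\qquad
\frac{dS}{dp} = -2q\,(1-p) + 2(1-q)\,p = 2\,(p-q),
\]

so the expected score is minimized exactly at p = q: announcing anything other than your true belief can only increase the penalty you expect to incur.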
aspects. Often the conclusions in the social and behavioral sciences are tentative and the personalistic perception of probability, as a measure of your belief in a hypothesis, can provide an expression of uncertainty that can be conveyed to others. We still have a long way to go in employing these ideas, when even a weather-forecaster is reluctant to say there is a probability of 0.8 that it will rain tomorrow, or, if he does, sometimes does not appreciate the meaning of what is being said. Savage had been described as ‘the Euclid of statistics’ in that he did for statistics what Euclid did for geometry, laying down a set of axioms from which, as theorems, the only sensible statistical procedures emerge. As with Euclid, other geometries will emerge, so other statistical systems may arise, but, for the moment, Savage’s Bayesian approach is the only systematic one available and appears to work. He was, in the sense of Kuhn, a true revolutionary, who overturned one paradigm, replacing it by another without, at first, realizing what he had done. See also: Bayesian Statistics; Bayesian Theory: History of Applications; Decision Theory: Bayesian; Fisher, Ronald A (1890–1962); Hotelling, Harold (1895– 1973); Probability: Interpretations; Risk: Theories of Decision and Choice; Statistical Methods, History of: Pre-1900; Statistics, History of; Tversky, Amos (1937–96)
Bibliography

Birnbaum A 1962 On the foundations of statistical inference. Journal of the American Statistical Association 57: 269–306
de Finetti B 1972 Probability, Induction and Statistics: The Art of Guessing. Wiley, London
de Finetti B 1974/5 Theory of Probability, 2 Vols. Wiley, London
Dubins L E, Savage L J 1965 How to Gamble if You Must: Inequalities for Stochastic Processes. McGraw-Hill, New York
Ramsey F P 1926 Truth and probability. In: Braithwaite R B (ed.) The Foundations of Mathematics and Other Logical Essays. Routledge and Kegan Paul, London
Savage L J 1954 The Foundations of Statistics. Wiley, New York
Savage L J 1971 Elicitation of personal probabilities and expectations. Journal of the American Statistical Association 66: 783–801
Savage L J 1972 The Foundations of Statistics, 2nd edn. Dover, New York
Savage L J et al. 1976 On re-reading R A Fisher. Annals of Statistics 4: 441–500
Savage L J 1981 The Writings of Leonard Jimmie Savage—A Memorial Selection. American Statistical Association and Institute of Mathematical Statistics, Washington, DC
Savage L J, Edwards W, Lindman H 1963 Bayesian statistical inference for psychological research. Psychological Review 70: 193–242
Savage L J, Hewitt E 1955 Symmetric measures on Cartesian products. Transactions of the American Mathematical Society 80: 470–501
Savage L J et al. 1962 The Foundations of Statistical Inference: A Discussion. Methuen, London
von Neumann J, Morgenstern O 1944 Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ
Wald A 1950 Statistical Decision Functions. Wiley, New York
D. V. Lindley
Savings Behavior: Demographic Influences In any period all economic output is either consumed or saved. By consuming, individuals satisfy their current material needs. By saving, individuals accumulate wealth that serves a multitude of purposes. When invested or loaned, wealth yields a stream of income to the holder. It is a means of providing for future material needs, for example, during retirement. Wealth provides partial protection against many of life’s uncertainties, including the loss of a job or unexpected medical needs. Wealth can be passed on to future generations out of feelings of altruism, or used as a carrot to encourage desired behavior among those who hope for a bequest. Wealth provides status to the holder. Because of the varied purposes served by saving, many behavioral models have been proposed and a variety of influencing factors have been identified. Demographic factors, including age structure, fertility, and mortality, have been found to influence saving in many studies. Additional factors that have been identified include the level of per capita income, economic growth rates, interest rates, characteristics of the financial system, fiscal policy, uncertainty, and public pension programs.
1. Saving Trends and their Importance

In the 1960s, low saving rates in the developing world were a serious impediment to economic development. The industrialized countries had much higher rates of saving, but international capital flows from the industrialized to the developing world were insufficient to finance needed investment in infrastructure and industrial enterprise. Since that time, the developing countries have followed divergent paths. Saving rates in sub-Saharan Africa have remained low. After increasing somewhat in the 1970s, saving rates in Latin America declined during the late 1970s and early 1980s. In contrast, South Asian and especially East Asian saving rates have increased substantially and currently are well above saving rates found in the industrialized countries. The emergence of high saving rates is widely believed to bear major responsibility for Asia's rapid economic growth up until the mid-1990s.
The industrialized countries face their own saving issue. Since the mid-1970s, they have experienced a gradual, but steady, decline in saving rates. US saving rates have reached especially low levels. Current US household saving rates are near zero. The low rates of saving raise two concerns: first, that economic growth will be unsustainable and, second, that current generations of workers will face substantially reduced standards of living when they retire.
2. General Saving Concepts

The national saving rate is the outcome of decisions by three sets of actors (governments, firms, and households), but analyses of saving rates are overwhelmingly based on household behavioral models. The reliance on household models is justified on two grounds. First, decisions by households may fully incorporate the decisions made by firms and governments. Firms are owned by households. When firms accumulate wealth, the wealth of households increases as well. Thus, from the perspective of the household, saving by firms and saving by households are close substitutes. Consequently, household behavior determines the private saving rate, the combined saving of firms and households. Government saving also affects the wealth of the household sector, by affecting current and future taxes. By issuing debt, governments can, in principle, increase consumption and reduce national saving at the expense of future generations. However, households may choose to compensate future generations (their children) by increasing their saving and planned bequests. If they do so, national saving is determined entirely by the household sector and is independent of government saving (Barro 1974). Second, firms and governments may act as agents for households. Their saving on behalf of households may be influenced by the same factors that influence household behavior. The clearest example of this type of behavior is when firms or governments accumulate pension funds on behalf of their workers or citizens. Households are motivated to save for a number of reasons, but current research emphasizes three motives: insurance, bequests, and lifecycle motives. Households may accumulate wealth to insure themselves against uncertain events, e.g., the loss of a crop or a job, the death of a spouse, or an unanticipated medical expense. Households may save in order to accumulate an estate intended for their descendants. Household saving may be motivated by lifecycle concerns, the divergence between the household's earnings profile and its preferred consumption profile. The importance of these and other motives in determining national saving rates is a matter of vigorous debate among economists. Proponents of the bequest motive have estimated that as much as 80 percent of US national saving is accounted for by the bequest motive (Kotlikoff 1988). Proponents of the
lifecycle motive have estimated that an equally large portion of US national saving is accounted for by the lifecycle motive (Modigliani 1988). Clearly, the way in which changing demographic conditions impinge on national saving rates will depend on the importance of these different motives.
3. The Lifecycle Model

Both the larger debate on the determinants of saving and explorations of the impact of demographic factors on saving have been framed by the lifecycle model (Modigliani and Brumberg 1954). The key idea that motivates the model is the observation that for extended portions of our lives we are incapable of providing for our own material needs. Thus, economic resources must be reallocated from economically productive individuals, concentrated at the working ages, to dependents, concentrated at young or old ages. Several mechanisms exist for achieving this reallocation. In traditional societies, the family plays a dominant role. Typically, the young, the old, and those of working age live in extended families supported by productive family members. In modern societies, solving the lifecycle problem is a shared responsibility. Although the young continue to rely primarily on the family, the elderly rely on the family, on transfers from workers effected by the government, and on personal wealth they have accumulated during their working years (Lee 2000). The lifecycle saving model is most clearly relevant to higher-income settings where capital markets have developed, family support systems have eroded, and workers anticipate extended periods of retirement. Under these conditions, the lifecycle model implies that households consisting of young, working-age adults will save, while households consisting of old, retired adults will dis-save. The national saving rate is determined, in part, by the relative size of these demographic groups. A young age structure yields high saving rates; an old age structure yields low saving rates. Likewise, a rise in the rate of population growth leads to higher saving rates because of the shift to a younger age structure. In the standard formulation of the lifecycle model, changes in the growth rate of per capita income operate in exactly the same way as changes in the population growth rate. Given higher long-run rates of economic growth, young adults have greater lifetime earnings than older adults. They have a correspondingly greater impact on the aggregate saving rate because of their control of a larger share of economic resources. Consequently, an increase in either the population growth rate or the per capita income growth rate leads to higher saving. The rate-of-growth effect is one of the most important and widely tested implications of the lifecycle saving model. With a great deal of consistency,
empirical studies have found that an increase in the rate of per capita income growth leads to an increase in the national saving rate. Empirical research does not, however, support the existence of a positive population rate-of-growth effect (Deaton 1989). The standard lifecycle model provides no basis for reconciling these divergent rate-of-growth effects. The impact of child dependency on household saving provides one possible explanation of why population growth and economic growth need not have the same effect on aggregate saving. Coale and Hoover (1958) were the first to point out the potentially important impact of child dependency. They hypothesized that 'A family with the same total income but with a larger number of children would surely tend to consume more and save less, other things being equal' (p. 25). This raised the possibility that saving follows an inverted-U-shaped curve over the demographic transition. In low-income countries, with rapid population growth rates and young age structures, slower population growth would lead to higher saving. But in higher-income countries that were further along in their demographic transitions, slower population growth would lead to a population concentrated at older, low-saving ages, as hypothesized in the lifecycle model. These contrasting demographic effects have been modeled in the empirical literature using the youth and old-age dependency ratios. The variable lifecycle model incorporates the role of child dependency by allowing demographic factors to influence both the age structure of the population and the age profiles of consumption, earning, and, hence, saving. In this version of the lifecycle model, the size of the rate-of-growth effect varies depending, among other things, on the number or cost of children. As with the dependency-ratio model, saving follows an inverted-U-shaped path over the demographic transition. However, the impact of demographic factors varies with the rate of economic growth. The model implies that in countries with rapid economic growth, e.g., East Asia, demographic factors will have a large impact on saving, but in countries with slow economic growth, e.g., Africa, demographic factors will have a more modest effect (Mason 1987). Empirical studies of population and saving come in two forms. Most frequently, researchers have based their analyses on aggregate time-series data, now available for many countries. Recent estimates support the existence of large demographic effects on saving. Analysis by Kelley and Schmidt (1996) supports the variable lifecycle model. Higgins and Williamson (1997) find that demographic factors influence saving independently of the rate of economic growth. An alternative approach relies on microeconomic data to construct an age-saving profile. The impact of age structure is then assessed assuming that the age profile does not change. In the few applications undertaken, changes in age structure had an impact on saving that is more modest than that found in analyses of
aggregate saving data (see Deaton and Paxson (1997) for an example). Until these approaches are reconciled, a firm consensus about the magnitude of demographic effects is unlikely to emerge.
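The rate-of-growth effect at issue here can be conveyed by a deliberately stylized computation. The sketch below assumes two equal-sized generations, a fixed saving share for workers, and a fixed dis-saving share for retirees; all parameter values are invented for illustration and correspond to no published model or actual economy.

```python
# Stylized lifecycle saving: one working and one retired generation of
# equal size. Workers save a fixed share of earnings; retirees dis-save.
# Parameter values are illustrative only.

def aggregate_saving_rate(g, saving_share=0.20, dissaving_share=0.15):
    """Aggregate saving as a share of current output.

    Earnings grow at rate g per generation, so today's workers earn
    (1 + g) times what today's retirees earned when they worked.
    """
    worker_income = 1.0 + g
    retiree_base = 1.0
    saving = saving_share * worker_income - dissaving_share * retiree_base
    return saving / worker_income

for g in (0.0, 0.5, 1.0):
    print(f"growth {g:.1f} per generation: saving rate {aggregate_saving_rate(g):.3f}")
# Prints 0.050, 0.100, 0.125: faster income growth raises aggregate
# saving because the high-saving young control a larger share of resources.
```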
4. Unresolved Issues

There are a number of important issues that have not been resolved and that require additional work. First, the saving literature does not yet adequately incorporate the impact of changing institutional arrangements. Studies of saving in the industrialized countries and recent work on developing countries consider the impact of state-sponsored pension programs on saving, but the impact of family support systems has received far too little emphasis. The erosion of the extended family in developing countries is surely one of the factors that have contributed to the rise of saving rates observed in many developing countries. Second, the role of mortality has received inadequate attention. The importance of lifecycle saving depends on the expected duration of retirement. In high-mortality societies, few reach old age, and many who do continue to work. Only late in the mortality transition, when there are substantial gains in the years lived late in life, does an important pension motive emerge. Third, the saving models currently in use are static models and do not capture important dynamics. Recent simulation work that combines realistic demographics with lifecycle saving behavior shows that during the demographic transition countries may experience saving rates that substantially exceed equilibrium values for sustained periods of time (Lee et al. 2000).
Bibliography

Barro R J 1974 Are government bonds net wealth? Journal of Political Economy 82(6): 1095–117
Coale A J, Hoover E M 1958 Population Growth and Economic Development in Low-income Countries: A Case Study of India's Prospects. Princeton University Press, Princeton, NJ
Deaton A 1989 Saving in developing countries: Theory and review. Proceedings of the World Bank Annual Conference on Development Economics, Supplement to the World Bank Economic Review and the World Bank Research Observer, pp. 61–96
Deaton A, Paxson C 1997 The effects of economic and population growth on national saving and inequality. Demography 34(1): 97–114
Higgins M, Williamson J G 1997 Age structure dynamics in Asia and dependence on foreign capital. Population and Development Review 23(2): 261–94
Kelley A C, Schmidt R M 1996 Saving, dependency and development. Journal of Population Economics 9(4): 365–86
Kotlikoff L J 1988 Intergenerational transfers and savings. Journal of Economic Perspectives 2(2): 41–58
Lee R D 2000 Intergenerational transfers and the economic life cycle: A cross-cultural perspective. In: Mason A, Tapinos G (eds.) Sharing the Wealth: Demographic Change and Economic Transfers Between Generations. Oxford University Press, Oxford, UK
Lee R D, Mason A, Miller T 2000 Life cycle saving and the demographic transition: The case of Taiwan. In: Chu C Y, Lee R D (eds.) Population and Economic Change in East Asia, Population and Development Review 26: 194–219
Mason A 1987 National saving rates and population growth: A new model and new evidence. In: Johnson D G, Lee R D (eds.) Population Growth and Economic Development: Issues and Evidence. University of Wisconsin Press, Madison, WI, pp. 523–60
Modigliani F 1988 The role of intergenerational transfers and life cycle saving in the accumulation of wealth. Journal of Economic Perspectives 2(2): 15–40
Modigliani F, Brumberg R 1954 Utility analysis and the consumption function: An interpretation of cross-section data. In: Kurihara K (ed.) Post-Keynesian Economics. Rutgers University Press, New Brunswick, NJ
A. Mason
Scale in Geography

Scale is about size, either relative or absolute, and involves a fundamental set of issues in geography. Scale primarily concerns space in geography, and this article will focus on spatial scale. However, the domains of temporal and thematic scale are also important to geographers. Temporal scale deals with the size of time units, thematic scale with the grouping of entities or attributes such as people or weather variables. Whether spatial, temporal, or thematic, scale in fact has several meanings in geography.
1. Three Meanings of Scale

The concept of scale can be confusing, insofar as it has multiple referents. Cartographic scale refers to the depicted size of a feature on a map relative to its actual size in the world. Analysis scale refers to the size of the unit at which some problem is analyzed, such as at the county or state level. Phenomenon scale refers to the size at which human or physical earth structures or processes exist, regardless of how they are studied or represented. Although the three referents of scale frequently are treated independently, they are in fact interrelated in important ways that are relevant to all geographers, and the focus of research for some. For example, choices concerning the scale at which a map
should be made depend in part on the scale at which measurements of earth features are made and the scale at which a phenomenon of interest actually exists.
1.1 Cartographic Scale

Maps are smaller than the part of the earth’s surface they depict. Cartographic scale expresses this relationship, traditionally in one of three ways. A verbal scale statement expresses in words the amount of distance on the map that represents a particular distance on the earth’s surface, e.g., ‘one inch equals a mile.’ The representative fraction (RF) expresses scale as a numerical ratio of map distance to earth distance, e.g., ‘1:63,360.’ The RF has the advantage of being a unitless measure. Finally, a graphic scale bar uses a line of particular length drawn on the map and annotated to show how much earth distance it represents. A graphic scale bar has the advantage that it changes size appropriately when the map is enlarged or reduced. Alternatively, all three expressions of scale may refer to areal measurements rather than linear measurements, e.g., a 1-inch square may represent 1 square mile on the earth.

Given a map of fixed size, as the size of the represented earth surface gets larger, the RF gets smaller (i.e., the denominator of the RF becomes a larger number). Hence, a ‘large-scale map’ shows a relatively small area of the earth, such as a county or city, and a ‘small-scale map’ shows a relatively large area, such as a continent or a hemisphere of the earth. This cartographic scale terminology is frequently felt to be counterintuitive when applied to analysis or phenomenon scale, where small-scale and large-scale usually refer to small and large entities, respectively.

An important complexity about cartographic scale is that flat maps invariably distort spatial relations on the earth’s surface: distance, direction, shape, and/or area. How they distort these relations is part of the topic of map projections. In many projections, especially small-scale maps that show large parts of the earth, this distortion is extreme, so that linear or areal scale on one part of the map is very different from that on other parts. Even so-called equal-area projections maintain equivalent areal scale only for particular global features, and not for all features at all places on the map. Variable scale is sometimes shown on a map by the use of a special symbol or multiple symbols at different locations.
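For concreteness, here is a minimal Python sketch (not part of the original article) of the relation between a verbal scale statement and the RF; the unit-conversion constants are standard, but the function and its names are invented for illustration:

```python
# Convert a verbal scale statement such as "one inch equals a mile"
# into a representative fraction (RF), a unitless ratio.

UNITS_IN_INCHES = {
    "inch": 1,
    "foot": 12,
    "mile": 63_360,       # 5,280 feet x 12 inches per foot
    "kilometer": 39_370,  # approximate
}

def representative_fraction(map_inches: float, earth_value: float, earth_unit: str) -> str:
    """Express 'map_inches on the map = earth_value earth_units' as an RF."""
    earth_inches = earth_value * UNITS_IN_INCHES[earth_unit]
    return f"1:{earth_inches / map_inches:,.0f}"

# "One inch equals a mile" yields 1:63,360, the RF quoted in the text.
print(representative_fraction(1, 1, "mile"))
```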
1.2 Analysis Scale

Analysis scale includes the size of the units in which phenomena are measured and the size of the units into which measurements are aggregated for data analysis and mapping. It is essentially the scale of understanding of geographic phenomena. Terms such as
‘resolution’ or ‘granularity’ are often used as synonyms for the scale of analysis, particularly when geographers work with digital representations of the earth’s surface in a computer by means of a regular grid of small cells in a satellite image (rasters) or on a computer screen (pixels). Analysis scale here refers to the area of earth surface represented by a single cell. It has long been recognized that in order to observe and study a phenomenon most accurately, the scale of analysis must match the actual scale of the phenomenon. This is true for all three domains of scale—spatial, temporal, and thematic. Identifying the correct scale of phenomena is, thus, a central problem for geographers. Particularly when talking about thematic scale, using data at one scale to make inferences about phenomena at other scales is known as the cross-level fallacy (the narrower case of using aggregated data to make inferences about disaggregated data is well-known as the ecological fallacy).

Geographers often analyze phenomena at what might be called ‘available scale,’ the units that are present in available data. Many problems of analysis scale arise from this practice, but it is unavoidable given the difficulty and expense involved in collecting many types of data over large parts of the earth’s surface. Geographers have little choice in some cases but to analyze phenomena with secondary data, data collected by others not specifically for the purposes of a particular analysis. For example, census bureaus in many countries provide a wealth of data on many social, demographic, and economic characteristics of their populace. Frequently, the phenomenon of interest does not operate according to the boundaries of existing administrative or political units in the data, which after all were not created to serve the needs of geographic analysis. The resolution of image scanners on remote-sensing satellites provides another important example. Landsat imagery is derived from Thematic Mapper sensors, producing earth measurements at a resolution of about 30 by 30 meters. However, many phenomena occur at finer resolutions than these data can provide.

Most useful is theory about the scale of a phenomenon’s existence. Frequently lacking this, but realizing that the available scale may not be suitable, geographers use empirical ‘trial-and-error’ approaches to try to identify the appropriate scale at which a phenomenon should be analyzed. Given spatial units of a particular size, one can readily aggregate or combine them into larger units; it is not possible without additional information or theory to disaggregate them into smaller units. Even given observations measured at very small units, however, there is still the problem of deciding in what way the units should be aggregated. This is known as the modifiable areal unit problem (MAUP, or MTUP in the case of temporal scale). Various techniques have been developed to study the implications of MAUP (Openshaw 1983).
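To make the scale effect behind the MAUP concrete, here is a small Python sketch (not from the article, using synthetic data): a spatially smooth variable is related to a second variable measured with cell-level noise, and the correlation between the two strengthens as cells are aggregated into larger areal units.

```python
import numpy as np

rng = np.random.default_rng(0)

# A spatially smooth variable: constant within 8 x 8 patches of a 64 x 64 grid.
x = np.kron(rng.normal(size=(8, 8)), np.ones((8, 8)))
# A related variable observed with independent cell-level noise.
y = 0.3 * x + rng.normal(size=(64, 64))

def aggregate(grid: np.ndarray, block: int) -> np.ndarray:
    """Average block x block cells into one coarser areal unit."""
    m = grid.shape[0] // block
    return grid.reshape(m, block, m, block).mean(axis=(1, 3))

for block in (1, 4, 8):
    xa, ya = aggregate(x, block), aggregate(y, block)
    r = np.corrcoef(xa.ravel(), ya.ravel())[0, 1]
    print(f"units of {block}x{block} cells: r = {r:.2f}")
# The cell-level correlation is weak, but it rises sharply under aggregation:
# the noise averages out while the spatial pattern remains.
```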
1.3 Phenomenon Scale

Phenomenon scale refers to the size at which geographic structures exist and over which geographic processes operate in the world. It is the ‘true’ scale of geographic phenomena. Determining the scale of phenomena is clearly a major research goal in geography. It is a common geographic dictum that scale matters. Numerous concepts in geography reflect the idea that phenomena are scale-dependent or are defined in part by their scale. Vegetation stands are smaller than vegetation regions, and linguistic dialects are distributed over smaller areas than languages. The possibility that some geographic phenomena are scale-independent is important, however. Patterns seen at one scale may often be observed at other scales; whether this is a matter of analogy or of the same processes operating at multiple scales is theoretically important. The mathematics of fractals has been applied in geography as a way of understanding and formalizing phenomena such as coastlines that are self-similar at different scales (Lam and Quattrochi 1992).

The belief has often been expressed that the discipline of geography, as the study of the earth as the home of humanity, can be defined partially by its focus on phenomena at certain scales, such as cities or continents, and not other scales. The range of scales of interest to geographers is often summarized by the use of terminological continua such as ‘local–global’ or ‘micro-, meso-, macroscale.’ The view that geographers must restrict their focus to particular ranges of scales is not shared universally, however, and advances have occurred and will continue to occur when geographers stretch the boundaries of their subject matter. Nonetheless, few would argue that subatomic or interplanetary scales are properly of concern for geography.

It is widely recognized that various scales of geographic phenomena interact, or that phenomena at one scale emerge from smaller or larger scale phenomena. This is captured by the notion of a ‘hierarchy of scales,’ in which smaller phenomena are nested within larger phenomena. Local economies are nested within regional economies, and rivers are nested within larger hydrologic systems. Conceptualizing and modeling such scale hierarchies can be quite difficult, and the traditional practice within geography of focusing on a single scale largely continues.
2. Generalization

The world can never be studied, modeled, or represented in all of its full detail and complexity. Scale is important in part because of its consequences for the degree to which geographic information is generalized. Generalization refers to the amount of detail included in information; it is essentially an issue of simplification, but also includes aspects of selection and enhancement of features of particular interest. As one studies or represents smaller pieces of the earth, one tends strongly to deal with more detailed or more fine-grained aspects of geographic features. For example, large-scale maps almost always show features on the earth’s surface in greater detail than do small-scale maps; rivers appear to meander more when they are shown on large-scale maps, for instance. Studied most extensively by cartographers, generalization is in fact relevant to all three meanings of scale, and to all three domains of spatial, temporal, and thematic scale.
3. Conclusion

Issues of scale have always been central to geographic theory and research. Advances in the understanding of scale and the ability to investigate scale-related problems will continue, particularly with the increasingly common representation of geographic phenomena through the medium of digital geographic information (Goodchild and Proctor 1997). Cartographic scale is becoming ‘visualization’ scale. How is scale, spatial and temporal, communicated in dynamic, multidimensional, and multimodal representations, including visualization in virtual environments? Progress continues on the problem of automated generalization, programming intelligent machines to make generalization changes in geographic data as scale changes. The ability to perform multiscale and hierarchical analysis will be developed further. More profound than any of these advances, however, will be the new conceptions of scale in geography fostered by the widespread emergence of the ‘digital world.’
Bibliography
Buttenfield B P, McMaster R B (eds.) 1991 Map Generalization: Making Rules for Knowledge Representation. Wiley, New York
Goodchild M F, Proctor J 1997 Scale in a digital geographic world. Geographical and Environmental Modeling 1: 5–23
Hudson J C 1992 Scale in space and time. In: Abler R F, Marcus M G, Olson J M (eds.) Geography’s Inner Worlds: Pervasive Themes in Contemporary American Geography. Rutgers University Press, New Brunswick, NJ
Lam N S-N, Quattrochi D A 1992 On the issues of scale, resolution, and fractal analysis in the mapping sciences. The Professional Geographer 44: 88–98
MacEachren A M 1995 How Maps Work: Representation, Visualization, and Design. Guilford Press, New York
Meyer W B, Gregory D, Turner B L, McDowell P F 1992 The local–global continuum. In: Abler R F, Marcus M G, Olson J M (eds.) Geography’s Inner Worlds: Pervasive Themes in Contemporary American Geography. Rutgers University Press, New Brunswick, NJ
Muehrcke P C, Muehrcke J O 1992 Map Use: Reading, Analysis, Interpretation, 3rd edn. JP Publications, Madison, WI
Openshaw S 1983 The Modifiable Areal Unit Problem. Geo Books, Norfolk, UK
D. R. Montello
Scaling and Classification in Social Measurement

Social measurements translate observed characteristics of individuals, events, relationships, organizations, societies, etc. into symbolic classifications that enable reasoning of a verbal, logical, or mathematical nature. Qualitative research and censuses together define one realm of measurement, concerned with assignment of entities to classification categories embedded within taxonomies and typologies. Another topic in measurement involves scaling discrete items of information such as answers to questions so as to produce quantitative measurements for mathematical analyses. A third issue is the linkage between social measurements and social theories.

1. Classifications

Classification assimilates perceived phenomena into symbolically labeled categories. Anthropological studies of folk classification systems (D’Andrade 1995) have advanced understanding of scientific classification systems, though scientific usages involve criteria that folk systems may not meet entirely. Two areas of social science employ classification systems centrally. Qualitative analyses such as ethnographies, histories, case studies, etc. offer classifications—sometimes newly invented—for translating experiences in unfamiliar cultures or minds into familiar terms. Censuses of individuals, of occurrences, or of aggregate social units apply classifications—usually traditional—in order to count entities and their variations. Both types of work depend on theoretical constructions that link classification categories.

1.1 Taxonomies

Every classification category is located within a taxonomy. Some more general categorization, Y, determines which entities are in the domain for the focal categorization, X; so an X always must be a kind of Y. ‘X is a kind of Y’ is the linguistic frame for specifying taxonomies. Concepts constituting a taxonomy form a logic tree, with subordinate elements implying superordinate items. Taxonomic enclosure of a classification category is a social construction that may have both theoretical and practical consequences. For example, if only violent crimes are subject to classification as homicides, then ‘homicide’ is a kind of ‘violent crime,’ and deaths caused by executive directives to release deadly pollution could not be homicides.

1.2 Typologies

A typology differentiates entities at a particular level of a taxonomy in terms of one or more of their properties. The differentiating property (sometimes called a feature or attribute) essentially acts as a modifier of entities at that taxonomic level. For example, in the USA kinship system siblings are distinguished in terms of whether they are male or female; in Japan by comparison, siblings are schematized in terms of whether they are older as well as whether they are male or female. A scientific typology differentiates entities into types that are exclusive and exhaustive: every entity at the relevant taxonomic level is of one defined type only, and every entity is of some defined type. A division into two types is a dichotomy, into three types a trichotomy, and into more than three types a polytomy. Polytomous typologies are often constructed by crossing multiple properties, forming a table in which each cell is a theoretical type. (The crossed properties might be referred to as variables, dimensions, or factors in the typology.) For example, members of a multiplex society have been characterized according to whether they do or do not accept the society’s goals on the one hand, and whether they do or do not accept the society’s means of achieving goals on the other hand; then crossing acceptance of goals and means produces a fourfold table defining conformists and three types of deviants. Etic–emic analysis involves defining a typology with properties of scientific interest (the etic system) and then discovering ethnographically which types and combinations of types are recognized in folk meanings (the emic system). Latent structure analysis statistically processes observed properties of a sample of entities in order to confirm the existence of hypothesized types and to define the types operationally.

1.3 Aggregate Entities

Aggregate social entities such as organizations, communities, and cultures may be studied as unique cases, where measurements identify and order internal characteristics of the entity rather than relate one aggregate entity to another. A seeming enigma in social measurement is how aggregate social entities can be described satisfactorily on the basis of the reports of relatively few informants, even though statistical theory calls for substantial samples of respondents to survey populations. The
key is that informants all report on the same thing—a single culture, community, or organization—whereas respondents in a social survey typically report on diverse things—their own personal characteristics, beliefs, or experiences. Thus, reports from informants serve as multiple indicators of a single state, and the number needed depends on how many observations are needed to define a point reliably, rather than how many respondents are needed to describe a population’s diversity reliably. As few as seven expert informants can yield reliable descriptions of aggregate social entities, though more are needed as informants’ expertise declines (Romney et al. 1986). Informant expertise correlates with greater intelligence and experience (D’Andrade 1995) and with having a high level of social integration (Thomas and Heise 1995).
2. Relations

Case grammar in linguistics defines events and relationships in terms of an actor, action, object, and perhaps instrumentation, setting, products, and other factors as well. Mapping sentences (Shye et al. 1994) apply the case grammar idea with relatively small lists of entities in order to classify relational phenomena within social aggregates. For example, interpersonal relations in a group can be specified by pairing group members with actions such as loves, admires, annoys, befriends, and angers. Mapping sentences defining the relations between individuals or among social organizations constitute the measurement model for social network research.
3. Scaling

Quantitative measurements differentiate entities at a given taxonomic level—serving like typological classifications, but obtaining greater logical and mathematical power by ordering the classification categories. An influential conceptualization (Stevens 1951) posited four levels of quantification in terms of how numbers relate to classification categories. Nominal numeration involves assigning numbers arbitrarily, simply to give categories unique names, such as ‘batch 243.’ An ordinal scale’s categories are ordered monotonically in terms of greater-than and less-than, and numbering corresponds to the rank of each category. Numerical ranking of an individual’s preferences for different foods is an example of ordinal measurement. Differences between categories can be compared in an interval scale, and numbers applied to categories reflect degrees of differences. Calendar dates are an example of interval measurements—we know from their birth years that William Shakespeare was closer in time to Geoffrey Chaucer than Albert Einstein was to Isaac Newton. In a ratio scale categories have magnitudes that are whole or fractional multiples of one another,
and numbers assigned to the categories represent these magnitudes. Population sizes are an example of ratio measurements—knowing the populations of both nations, we can say that Japan is at least 35 times bigger than Jamaica. A key methodological concern in psychometrics (Hopkins 1998) has been: how do you measure entities on an interval scale given merely nominal or ordinal information?

3.1 Scaling Dichotomous Items

Nominal data are often dichotomous yes–no answers to questions, a judge’s presence–absence judgments about the features of entities, an expert’s claims about truth–falsity of propositions, etc. Answers of yes, present, true, etc. typically are coded as ‘one’ and no, absent, false, etc. as ‘zero.’ The goal, then, is to translate zero-one answers for each case in a sample of entities into a number representing the case’s position on an interval scale of measurement.

The first step requires identifying how items relate to the interval scale of measurement in terms of a graph of the items’ characteristic curves. The horizontal axis of such a graph is the interval scale of measurement, confined to the practical range of variation of entities actually being observed. The vertical axis indicates the probability that a specific dichotomous item has the value one for an entity with a given position on the interval scale of measurement. An item’s characteristic curve traces the changing probability of the item having the value one as an entity moves from having a minimal value on the interval scale to having the maximal value on the interval scale. Item characteristic curves have essentially three different shapes, corresponding to three different formulations about how items combine into a scale.

Spanning items have characteristic curves that essentially are straight lines stretching across the range of entity variation. A spanning item’s line may start at a low probability value and rise to a high probability value, or fall from a high value to a low value. A rising line means that the item is unlikely to have a value of one with entities having a low score on the interval scale; the item is likely to have a score of one for entities having a high score on the interval scale; and the probability of the item being valued at one increases regularly for entities between the low and high positions on the scale. Knowing an entity’s value on any one spanning item does not permit assessing the entity’s position along the interval scale. However, knowing the entity’s values on multiple spanning items does allow an estimate of positioning to be made. Suppose heuristically that we are working with a large number of equivalent spanning items, each having an item characteristic curve that starts at probability 0.00 at the minimal point of the interval scale, and rises in a
straight line to probability 1.00 at the maximal point of the interval scale. The probability of an item being valued at one can be estimated from the observed proportion of all these items that are valued at one—which is simply the mean item score when items are scored zero-one. Then we can use the characteristic curve for the items to find the point on the interval scale where the entity must be positioned in order to have the estimated item probability. This is the basic scheme involved in construction of composite scales, where an averaged or summated score on multiple items is used to estimate an entity’s interval-scale value on a dimension of interest (Lord and Novick 1968). The more items that are averaged, the better the estimate of an entity’s position on the interval scale. The upper bound on number of items is pragmatic, determined by how much precision is needed and how much it costs to collect data with more items. The quality of the estimate also depends on how ideal the items are in terms of having straight-line characteristic curves terminating at the extremes of probability. Irrelevant items with a flat characteristic curve would not yield an estimate of scale position no matter how many of them are averaged, because a flat curve means that the probability of the item having a value of one is uncorrelated with the entity’s position on the interval scale. Inferences are possible with scales that include relevant but imperfect items, but more items are required to achieve a given level of precision, and greater weight needs to be given to the more perfect items.

Declivitous items have characteristic curves that rise sharply at a particular point on the horizontal axis. Idealized, the probability of the item having a value of one increases from 0.00 to the left of the inflection point to 1.00 to the right of the inflection point; or alternatively the probability declines from 1.00 to 0.00 in passing the inflection point. Realistically, the characteristic curve of a declivitous item is S-shaped with a steep rise in the middle and graduated approaches to 0.00 at the bottom and to 1.00 at the top. The value of a single declivitous item tells little about an entity’s position along the interval scale. However, an inference about an entity’s scale position can be made from a set of declivitous items with different inflection points, or difficulties, that form a cumulative scale. Suppose heuristically that each item increases stepwise at its inflection point. Then for an entity midway along the horizontal axis, items at the left end of the scale will all have the value of one, items at the right end of the scale will all have the value zero, and the entity’s value on the interval scale is between the items with a score of one and the items with a score of zero. If the items’ inflection points are evenly distributed along the interval scale, then the sum of items’ zero-one scores for an entity constitutes an estimate of where the entity is positioned along the interval scale.
That is, few of the items have a value of one if the entity is on the lower end of the interval scale, and many of the items are valued at one if the entity is at the upper end of the interval scale. This is the basic scheme involved in Guttman scalogram analysis (e.g., see Shye 1978, Part 5). On the other hand, we might use empirical data to estimate the position of each item’s inflection point on the interval scale, while simultaneously estimating entity scores that take account of the item difficulties. This is the basic scheme involved in scaling with Rasch models (e.g., Andrich 1988). Entities’ positions on the interval scale can be pinned down as closely as desired through the use of more declivitous items with inflection points spaced closer and closer together. However, adding items to achieve more measurement precision at the low end of the interval scale does not help at the middle or the high end of the interval scale. Thus, obtaining high precision over the entire range of variation requires a large number of items, and it could be costly to obtain so much data. Alternatively, one can seek items whose characteristic curves rise gradually over a range of the interval scale such that sequential items on the scale have overlapping characteristic curves, whereby an entity’s position along the interval scale is indicated by several items. Regional items have characteristic curves that rise and fall within a limited range of the interval scale. That is, moving an entity up the interval scale increases the probability of a particular item having a value of one for a while, and then decreases the probability after the entity passes the characteristic curve’s maximum value. For example, in a scale measuring prejudice toward a particular ethnic group, the probability of agreeing with the item ‘they require equal but separate facilities’ increases as a person moves away from an apartheid position, and then decreases as the person moves further up the scale toward a nondiscriminatory position. A regional item’s characteristic curve is approximately bell-shaped if its maximum is at the middle of the interval scale, but characteristic curves at the ends of the scale are subject to floor and ceiling clipping, making them look like declivitous items. If an entity has a value of one on a regional item, then the entity’s position along the interval scale is known approximately, since the entity must be positioned in the part of the scale where that item has a nonzero probability of being valued at one. However, a value of zero on the same item can result from a variety of positions along the interval scale and reveals little about the entity’s position. Thus, multiple regional items have to be used to assess positions along the whole range of the scale. The items have to be sequenced relatively closely along the scale with overlapping characteristic curves so that no entity will end up in the noninformative state of having a zero value on all items.
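As a concrete illustration of the cumulative scheme described above, here is a minimal Python sketch (not from the article; the data and helper function are invented): entities are scored by summing their zero-one item values, and departures from the perfect cumulative pattern can be counted as Guttman errors.

```python
import numpy as np

# Rows: entities; columns: dichotomous items ordered from easiest to hardest.
responses = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
])

scores = responses.sum(axis=1)  # position estimate on the latent interval scale

def guttman_errors(row: np.ndarray) -> int:
    """Deviations from a perfect cumulative pattern: for a score s, a
    perfect scale has ones on exactly the s easiest items."""
    ideal = np.zeros_like(row)
    ideal[: row.sum()] = 1
    return int(np.sum(row != ideal))

print(scores)                                   # [3 2 4 1]
print([guttman_errors(r) for r in responses])   # all 0: a perfect scalogram
```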
One could ask judges to rate the scale position of each item, and average across judges to get an item score; then, later, respondents can be scored with the average of the items they accept. This simplistic approach to regional items was employed in some early attempts to measure social attitudes. Another approach is statistical unfolding of respondents’ choices of items on either side of their own positions on the interval scale in order to estimate scale values for items and respondents simultaneously (Coombs 1964). Item analysis is a routine aspect of scale construction with spanning items, declivitous items, or regional items. One typically starts with a notion of what one wants to measure, assembles items that should relate to the dimension, and tests the items in order to select the items that work best. Since a criterion measurement that can be used for assessing item quality typically is lacking, the items as a group are assumed to measure what they are supposed to measure, and scores based on this assumption are used to evaluate individual items. Items in a scale are presumed to measure a single dimension rather than multiple dimensions. Examining the dimensionality assumption brings in additional technology, such as component analysis or factor analysis in the case of spanning items, multidimensional scalogram analysis in the case of declivitous items, and nonmetric multidimensional scaling in the case of regional items. These statistical methods help in refining the conception of the focal dimension and in selecting the best items for measuring that dimension.
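A minimal Python sketch (not from the article, with synthetic responses) of the routine item analysis just described: each item's corrected item-total correlation is computed under the assumption that the items as a group measure the intended dimension, and an irrelevant item shows up with a near-zero correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_items = 200, 6
trait = rng.normal(size=n_people)           # the dimension to be measured

# Five items driven by the latent trait, plus one irrelevant item (column 5).
probs = np.column_stack(
    [1 / (1 + np.exp(-(trait - b))) for b in (-1, -0.5, 0, 0.5, 1)]
    + [np.full(n_people, 0.5)]
)
items = (rng.uniform(size=probs.shape) < probs).astype(int)

for j in range(n_items):
    rest = items.sum(axis=1) - items[:, j]     # total score excluding item j
    r = np.corrcoef(items[:, j], rest)[0, 1]   # corrected item-total correlation
    print(f"item {j}: item-total r = {r:.2f}")
# The irrelevant item shows a near-zero correlation and would be dropped.
```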
3.2 Ordered Assessments

Ordinal data—the starting point for a three-volume mathematical treatise on measurement theory (Krantz et al. 1971, Suppes et al. 1989, Luce et al. 1990)—may arise from individuals’ preferences, gradings of agreement with statements, reckonings of similarity between stimuli, etc. Conjoint analysis (Luce and Tukey 1964, Michell 1990) offers a general mathematical model for analyzing such data. According to the conjoint theory of measurement, positions on any viable quantitative dimension are predictable from positions on two other quantitative dimensions, and this assumption leads to tests of a dimension’s usefulness given just information of an ordinal nature. For example, societies might be ranked in terms of their socioeconomic development and also arrayed in terms of the extents of their patrifocal technologies (like herding) and matrifocal technologies (like horticulture), each of which contributes additively to socioeconomic development. Conjoint analyses could be conducted to test the meaningfulness of these dimensions, preliminarily to developing interval scales of socioeconomic development and patrifocal and matrifocal technologies. Specific scaling methodologies, like Rasch scaling and nonmetric multidimensional scaling, can be interpreted within the conjoint analysis framework.

Magnitude estimations involve comparisons to an anchor, for example: ‘Here is a reference sound. … How loud is this next sound relative to the reference sound.’ Trained judges using such a procedure can assess intensities of sensations and of a variety of social opinions on ratio scales (Stevens 1951, Lodge 1981). Comparing magnitude estimation in social surveys to the more common procedure of obtaining ratings on category scales with a fixed number of options, Lodge (1981) found that magnitude estimations are more costly but more accurate, especially in registering extreme positions.

Rating scales with bipolar adjective anchors like good–bad are often used to assess affective meanings of perceptions, individuals, events, etc. Such scales traditionally provided seven or nine answer positions between the opposing poles with the middle position defined as neutral. Computerized presentations of such scales with hundreds of rating positions along a graphic line yield greater precision by incorporating some aspects of magnitude estimation. Cross-cultural and cross-linguistic research in dozens of societies has demonstrated that bipolar rating scales align with three dimensions—evaluation, potency, and activity (Osgood et al. 1975). An implication is that research employing bipolar rating scales should include scales measuring the standard three dimensions in order to identify contributions of these dimensions to rating variance on other bipolar scales.

4. Measurements and Theory
Theories and measurements are bound inextricably. In the first place, taxonomies and typologies—which are theoretical constructions, even when rooted in folk classification systems—are entailed in defining which entities are to be measured, so are part of all measurements. Second, scientists routinely assume that any particular measurement is wrong to some degree—even a measurement based on a scale—and that combining multiple measurements improves measurement precision. The underlying premise is that a true value exists for that which is being measured, as opposed to observed values, and that theories apply to true values, not ephemeral observed values. In essence, theories set expectations about what should be observed, in contrast to what is observed, and deviation from theoretical expectations is interpreted as measurement error (Kyburg 1992). A notion of measurement error as deviation from theoretical expectations is widely applicable, even in qualitative research (McCullagh 1984, Heise 1989). Third, the conjoint theory of measurement underlying many current measurement technologies requires theoretical specification of relations between variables
before the variables’ viability can be tested or meaningful scales constructed. This approach is in creative tension with traditional deductive science, wherein variable measurements are gathered in order to determine if relations among variables exist and theories are correct. In fact, a frequent theme in social science during the late twentieth century was that measurement technology had to improve in order to foster the growth of more powerful theories. It is clear now that the dependence between measurements and theories is more bidirectional than was supposed before the development of conjoint theory. See also: Classification: Conceptions in the Social Sciences; Dimensionality of Tests: Methodology; Factor Analysis and Latent Structure: IRT and Rasch Models; Test Theory: Applied Probabilistic Measurement Structures
Bibliography
Andrich D 1988 Rasch Models for Measurement. Sage, Newbury Park, CA
Coombs C H 1964 A Theory of Data. John Wiley & Sons, New York
D’Andrade R 1995 The Development of Cognitive Anthropology. Cambridge University Press, New York
Heise D R 1989 Modeling event structures. Journal of Mathematical Sociology 14: 139–69
Hopkins K D 1998 Educational and Psychological Measurement and Evaluation, 8th edn. Allyn and Bacon, Boston
Krantz D H, Luce R D, Suppes P, Tversky A 1971 Foundations of Measurement. Volume 1: Additive and Polynomial Representations. Academic Press, New York
Kyburg Jr. H E 1992 Measuring errors of measurement. In: Savage C W, Ehrlich P (eds.) Philosophical and Foundational Issues in Measurement. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 75–91
Lodge M 1981 Magnitude Scaling: Quantitative Measurement of Opinions. Sage, Beverly Hills, CA
Lord F M, Novick M R 1968 Statistical Theories of Mental Test Scores. Addison-Wesley, Reading, MA
Luce R D, Krantz D H, Suppes P, Tversky A 1990 Foundations of Measurement. Volume 3: Representation, Axiomatization, and Invariance. Academic Press, New York
Luce R D, Tukey J W 1964 Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology 1: 1–27
McCullagh C B 1984 Justifying Historical Descriptions. Cambridge University Press, New York
Michell J 1990 An Introduction to the Logic of Psychological Measurement. Lawrence Erlbaum Associates, Hillsdale, NJ
Osgood C E, May W H, Miron M S 1975 Cross-cultural Universals of Affective Meaning. University of Illinois Press, Urbana, IL
Romney A K, Weller S C, Batchelder W H 1986 Culture as consensus: A theory of culture and informant accuracy. American Anthropologist 88: 313–38
Shye S (ed.) 1978 Theory Construction and Data Analysis in the Behavioral Sciences, 1st edn. Jossey-Bass, San Francisco
Shye S, Elizur D, Hoffman M 1994 Introduction to Facet Theory: Content Design and Intrinsic Data Analysis in Behavioral Research. Sage, Thousand Oaks, CA
Stevens S S 1951 Mathematics, measurement, and psychophysics. In: Stevens S S (ed.) Handbook of Experimental Psychology. John Wiley and Sons, New York, pp. 1–49
Suppes P, Krantz D H, Luce R D, Tversky A 1989 Foundations of Measurement. Volume 2: Geometrical, Threshold, and Probabilistic Representations. Academic Press, New York
Thomas L, Heise D R 1995 Mining error variance and hitting pay-dirt: Discovering systematic variation in social sentiments. The Sociological Quarterly 36: 425–39
D. R. Heise
Scaling: Correspondence Analysis

In the early 1960s a dedicated group of French social scientists, led by the extraordinary scientist and philosopher Jean-Paul Benzécri, developed methods for structuring and interpreting large sets of complex data. This group’s method of choice was correspondence analysis, a method for transforming a rectangular table of data, usually counts, into a visual map which displays rows and columns of the table with respect to continuous underlying dimensions. This article introduces this approach to scaling, gives an illustration and indicates its wide applicability. Attention is limited here to the descriptive and exploratory uses of correspondence analysis methodology. More formal statistical tools have recently been developed and are described in Multivariate Analysis: Discrete Variables (Correspondence Models).
1. Historical Background

Benzécri’s contribution to data analysis in general and to correspondence analysis in particular was not so much in the mathematical theory underlying the methodology as in the strong attention paid to the graphical interpretation of the results and in the broad applicability of the methods to problems in many contexts. His initial interest was in analyzing large sparse matrices of word counts in linguistics, but he soon realized the power of the method in fields as diverse as biology, archeology, physics, and music. The fact that his approach paid so much attention to the visualization of data, to be interpreted with a degree of ingenuity and insight into the substantive problem, fitted perfectly the esprit géométrique of the French and their tradition of visual abstraction and creativity. Originally working in Rennes in western France, this group consolidated in Paris in the 1970s to become an influential and controversial movement in post-1968 France. In 1973 they published the two fundamental volumes of L’Analyse des Données (Data Analysis), the first on La Classification, that is,
unsupervised classification or cluster analysis, and the second on L’Analyse des Correspondances, or correspondence analysis (Benzécri 1973), as well as from 1977 the journal Les Cahiers de l’Analyse des Données, all of which reflect the depth and diversity of Benzécri’s work. For a more complete historical account of the origins of correspondence analysis, see Nishisato (1980), Greenacre (1984), and Gifi (1990).
2. Correspondence Analysis

Correspondence analysis (CA) is a variant of principal components analysis (PCA) applicable to categorical data rather than interval-level measurement data (see Factor Analysis and Latent Structure: Overview). For example, Table 1 is a contingency table obtained from the 1995 International Social Survey Program (ISSP) survey on national identity, tabulating responses from 23 countries on the question: ‘How much do you agree/disagree with the statement: Generally (respondent’s country) is a country better than most other countries?’ (For example, Austrians are asked to evaluate the statement: Generally Austria is a country better than most other countries.) The object of CA is to obtain a graphical display in the form of a spatial map of the rows (countries) and columns (question responses), where the dimensions of the map as well as the specific positions of the row and column points can be interpreted.
The theory of CA can be summarized by the following steps:

(a) Let N be the $I \times J$ table with grand total n and let $P = (1/n)N$ be the correspondence matrix, with grand total equal to 1. CA actually analyzes the correspondence matrix, which is free of the sample size. If N is a contingency table, then P is an observed bivariate discrete distribution.

(b) Let r and c be the vectors of row and column sums of P respectively, and let $D_r$ and $D_c$ be diagonal matrices with r and c on the diagonal.

(c) Compute the singular value decomposition of the centered and standardized matrix with general element $(p_{ij} - r_i c_j)/\sqrt{r_i c_j}$:

$$D_r^{-1/2}(P - rc^{\mathsf{T}})D_c^{-1/2} = U D_\alpha V^{\mathsf{T}} \qquad (1)$$

where the singular values are in descending order, $\alpha_1 \geq \alpha_2 \geq \cdots$, and $U^{\mathsf{T}}U = V^{\mathsf{T}}V = I$.

(d) Compute the standard coordinates X and Y:

$$X = D_r^{-1/2}U \qquad Y = D_c^{-1/2}V \qquad (2)$$

and principal coordinates F and G:

$$F = X D_\alpha \qquad G = Y D_\alpha \qquad (3)$$

Notice the following: The results of CA are in the form of a map of points representing the rows and columns with respect to a
Table 1 Responses in 23 countries to question on national pride. Source: ISSP 1995

Country             Agree strongly   Agree   Can’t decide   Disagree   Disagree strongly   Missing   Total response
Austria                        272     387            184         79                  33        52             1007
Bulgaria                       208     338            163        112                 129       155             1105
Canada                         538     620            223         99                  33        30             1543
Czech Republic                  73     156            372        289                 158        63             1111
Former E. Germany               59     138            138        142                  71        64              612
Former W. Germany              121     321            369        232                 139       100             1282
Great Britain                  151     408            282        139                  25        53             1058
Hungary                         66     175            268        258                 143        90             1000
Ireland                        167     529            149        120                  14        15              994
Italy                           70     324            298        281                  98        23             1094
Japan                          641     398            139         36                  24        18             1256
Latvia                          92     190            215        225                 153       169             1044
Netherlands                    158     758            545        417                 129        82             2089
Norway                         258     723            334        114                  30        68             1527
New Zealand                    283     499            170         41                   8        42             1043
Poland                         153     378            396        365                  80       226             1598
Philippines                    152     556            260        214                  10         8             1200
Russia                         272     297            352        307                 134       223             1585
Slovakia                       100     199            384        338                 267       100             1388
Slovenia                        61     204            258        366                  64        83             1036
Spain                           71     343            320        372                  42        73             1221
Sweden                         140     426            375        167                  80       108             1296
USA                            525     554            168         61                  21        38             1367
selected pair of principal axes, corresponding to pairs of columns of the coordinate matrices—usually the first two columns for the first two principal axes. The choice between principal and standard coordinates is described below.

The total variance, called inertia, is equal to the sum of squares of the matrix decomposed in (1):

$$\sum_i \sum_j (p_{ij} - r_i c_j)^2/(r_i c_j) \qquad (4)$$

which is the Pearson chi-squared statistic calculated on the original table divided by n (see Multivariate Analysis: Discrete Variables (Overview)). The squared singular values $\alpha_1^2, \alpha_2^2, \ldots$, called the principal inertias, decompose the inertia into parts attributable to the respective principal axes, just as in PCA the total variance is decomposed along principal axes.

The most popular type of map, called the symmetric map, uses the first two columns of F for the row coordinates and the first two columns of G for the column coordinates, that is, both in principal coordinates as given by (3). An alternative scaling, which has a more coherent geometric interpretation, but less aesthetic appearance, is the asymmetric map, for example, rows in principal coordinates F and columns in standard coordinates Y in (2) (or vice versa). The choice between a row-principal or column-principal asymmetric map is governed by whether the original table is considered as a set of rows or a set of columns, respectively, when expressed in percentage form.

The positions of the rows and the columns in a map are projections of points, called profiles, from their true positions in high-dimensional space onto a best-fitting lower-dimensional space. A row or column profile is the corresponding row or column of the table divided by its respective total—in the case of a contingency table the profile is a conditional frequency distribution. Each profile is weighted by a mass equal to the value of the corresponding row or column margin, $r_i$ or $c_j$. The space of the profiles is structured by a weighted Euclidean distance function called the chi-squared distance and the optimal map is obtained by fitting a lower-dimensional space which fits the profiles by weighted least-squares. Equivalent forms of (4) which show the use of profile, mass, and chi-squared distance are:

$$\sum_i r_i \sum_j (p_{ij}/r_i - c_j)^2/c_j = \sum_j c_j \sum_i (p_{ij}/c_j - r_i)^2/r_i \qquad (5)$$

Thus the inertia is a weighted average squared distance between the profile vectors (e.g., $p_{ij}/r_i$, $j = 1, \ldots$ for a row i profile, weighted by the mass $r_i$) and their respective average (e.g., $c_j$, $j = 1, \ldots$, the average row profile), where the distance is of a weighted Euclidean form (e.g., with inverse weighting of the j-th term by $c_j$).
An equivalent definition of CA is as a pair of classical scaling problems, one for the rows and one for the columns. For example, a square symmetric matrix of chi-squared distances can be calculated between the row profiles, with each point weighted by its respective row mass. Applying classical scaling (also known as principal coordinate analysis, see Scaling: Multidimensional) to this distance matrix, and taking the row masses into account, leads to the row principal coordinates in CA.

The singular value decomposition (SVD) may be written in terms of the standard coordinates in the following equivalent form, for the (i, j)-th element:

$$p_{ij} = r_i c_j \Bigl(1 + \sum_k \alpha_k x_{ik} y_{jk}\Bigr) \qquad (6)$$

which shows that CA can be considered as a bilinear model (see Multivariate Analysis: Discrete Variables (Correspondence Models)). For any particular solution, for example in two dimensions where the first two terms of this decomposition are retained, the residual elements have been minimized by weighted least-squares.
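Steps (a)–(d) and the inertia decomposition are straightforward to compute. The following is a minimal Python sketch (not part of the original article), assuming numpy and using a small invented table rather than the ISSP data:

```python
import numpy as np

N = np.array([[30, 10, 5],
              [10, 20, 10],
              [5, 10, 30]], dtype=float)       # toy two-way table

P = N / N.sum()                                # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)            # row and column masses

# Standardized residuals: (p_ij - r_i c_j) / sqrt(r_i c_j)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, alpha, Vt = np.linalg.svd(S, full_matrices=False)

X = U / np.sqrt(r)[:, None]                    # standard row coordinates
Y = Vt.T / np.sqrt(c)[:, None]                 # standard column coordinates
F = X * alpha                                  # principal row coordinates
G = Y * alpha                                  # principal column coordinates

inertia = (S ** 2).sum()                       # total inertia = chi-squared / n
print("principal inertias:", np.round(alpha ** 2, 4))
print("percent explained by axis 1:", round(100 * alpha[0] ** 2 / inertia, 1))
```

Plotting the first two columns of F against the first two columns of G would give the symmetric map described above.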
3. Application

[Figure 1. Symmetric correspondence analysis map of Table 1]

The symmetric map of Table 1, with rows and columns in principal coordinates, is given in Fig. 1. Looking at the positions of the question responses first with respect to the first (horizontal) principal axis, they are seen to lie in their substantive order from ‘strongly disagree’ on the left to ‘strongly agree’ on the right, with ‘missing’ on the disagreement side. The scale values of these categories constitute an optimal scale, by which is meant a standardized interval scale for the categorical variable of agreement–disagreement, including the missing value category, which optimally discriminates between the 23 countries, that is, which gives maximum between-country variance. In the two-dimensional map the response category points form a curve known as the ‘horseshoe’ or ‘arch,’ which is fairly common for data on an ordinal scale. The second dimension then separates out polarized groups towards the top, or inside the arch, where both extremes of the response scale lie, as well as the missing response in this case.

Turning attention to the countries, they line up from left to right in an ordination of agreement induced by the scale values. The five Eastern Bloc countries lie at the left, unfavorable extreme of the map, with Japan, Canada, USA, and New Zealand at the other, favorable side. The countries generally follow the curved shape as well, but a country such as Bulgaria, which lies in a unique position inside the curve, is characterized by a relatively high polarization of responses as well as high missing values. Bulgaria’s position in the middle of the first axis contrasts with the position of Great Britain, for example, which is also in the middle but due to responses more in the intermediate categories of the scale rather than a mixture of extreme responses.

The principal inertias are indicated in Fig. 1 at the positive end of each axis and the quality of the display is measured by adding together the percentages of inertia: 68.6 percent + 17.8 percent = 86.4 percent. This means that there is a ‘residual’ of 13.6 percent not depicted in the map, which can only be interpreted by investigating the next dimensions, from the third onwards. This 13.6 percent is the percentage of inertia minimized in the weighted least-squares solution of CA in two dimensions.
4. Contributions to Inertia

Apart from assessing the quality of the map by the percentages of inertia, other more detailed diagnostics in CA are the so-called contributions to inertia, based on the two decompositions of the total inertia, first by rows and second by columns:

$$\sum_i \sum_j (p_{ij} - r_i c_j)^2/(r_i c_j) = \sum_k \alpha_k^2 = \sum_k \sum_i r_i f_{ik}^2 = \sum_k \sum_j c_j g_{jk}^2$$

Every row component $r_i f_{ik}^2$ (respectively, column component $c_j g_{jk}^2$) can be expressed relative to the principal inertia $\alpha_k^2$ of the corresponding dimension k, where $\alpha_k^2$ is the sum of these components for all the rows (respectively, columns). These relative values provide a diagnostic for deciding which rows (respectively, columns) are important in the determination of the k-th principal axis.

In a similar way, for a fixed row each row component $r_i f_{ik}^2$ (respectively, column component $c_j g_{jk}^2$ for a fixed column) can be expressed relative to the total $\sum_k r_i f_{ik}^2$ (respectively, $\sum_k c_j g_{jk}^2$) across all principal axes. These relative values provide a diagnostic for deciding which axes are important in explaining each row or column. These values are analogous to the squared factor loadings in factor analysis, that is, squared correlations between the row or column and the corresponding principal axis or factor (see Factor Analysis and Latent Structure: Overview).
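Continuing the Python sketch above (same assumptions and variable names), these diagnostics can be computed directly from the masses and principal coordinates:

```python
K = 2                                            # the two mapped axes

# Contribution of row i to axis k: r_i * f_ik^2 / alpha_k^2 (columns sum to 1).
row_contrib = (r[:, None] * F[:, :K] ** 2) / alpha[:K] ** 2

# Squared correlation of row i with axis k: its component for axis k
# relative to the row's total across all axes (values between 0 and 1).
row_sqcorr = F[:, :K] ** 2 / (F ** 2).sum(axis=1, keepdims=True)

print(np.round(row_contrib[:, 0], 3))   # which rows determine the first axis
print(np.round(row_sqcorr[:, 0], 3))    # how well the first axis explains each row
```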
5. Extensions

Although the primary application of CA is to a two-way contingency table, the method is regularly applied to analyze multiway tables, tables of preferences, ratings, as well as measurement data on ratio- or interval-level scales. For multiway tables there are two approaches. The first approach is to convert the table to a flat two-way table which is appropriate to the problem at hand. Thus, if a third variable is introduced into the example above, say ‘sex of respondent,’ then an appropriate way to flatten the three-way table
would be to interactively code ‘country’ and ‘sex’ as a new row variable, with 23 × 2 = 46 categories, cross-tabulated against the question responses. For each country there would now be a male and a female point, and one could compare sexes and countries in this richer map. This process of interactive coding of the variables can continue as long as the data do not become too fragmented into interactive categories of very low frequency.

Another approach to multiway data, called multiple correspondence analysis (MCA), applies when there are several categorical variables skirting the same issue, often called ‘items.’ MCA is usually defined as the CA algorithm applied to an indicator matrix Z with the rows being the respondents or other sampling units, and the columns being dummy variables for each of the categories of all the variables. The data are zeros and ones, with the ones indicating the chosen categories for each respondent. The resultant map shows each category as a point and, in principle, the position of each respondent as well. Alternatively, one can set up what is called the Burt matrix, $B = Z^{\mathsf{T}}Z$, the square symmetric table of all two-way cross-tabulations of the variables, including the cross-tabulations of each variable with itself (named after the psychologist Sir Cyril Burt). The Burt matrix is reminiscent of a covariance matrix and the CA of the Burt matrix can be likened to a PCA of a covariance matrix. The analysis of the indicator matrix Z and the Burt matrix B give equivalent standard coordinates of the category points, but slightly different scalings in the principal coordinates since the principal inertias of B are the squares of those of Z. A variant of MCA called joint correspondence analysis (JCA) avoids the fitting of the tables on the diagonal of the Burt matrix, which is analogous to least-squares factor analysis.

As far as other types of data are concerned, namely rankings, ratings, paired comparisons, ratio-scale, and interval-scale measurements, the key idea is to recode the data in a form which justifies the basic constructs of CA, namely profile, mass, and chi-squared distance. For example, in the analysis of rankings, or preferences, applying the CA algorithm to the original rankings of a set of objects by a sample of subjects is difficult to justify, because there is no reason why weight should be accorded to an object in proportion to its average ranking. A practice called doubling resolves the issue by adding either an ‘anti-object’ for each ranked object or an ‘anti-subject’ for each responding subject, in both cases with rankings in the reverse order. This addition of apparently redundant data leads to CA effectively performing different variants of principal components analysis on the original rankings.

A recent finding by Carroll et al. (1997) is that CA can be applied to a square symmetric matrix of squared distances, transformed by subtracting each squared distance from a constant which is substantially larger
than the largest squared distance in the table. This yields a solution which approximates the classical scaling solution of the distance matrix. All these extensions of CA conform closely to Benzécri’s original conception of CA as a universal technique for exploring many different types of data through operations such as doubling or other judicious transformations of the data. The latest developments on the subject, including discussions of sampling properties of CA solutions and a comprehensive reference list, may be found in the volumes edited by Greenacre and Blasius (1994) and Blasius and Greenacre (1998). See also: Factor Analysis and Latent Structure: Overview; Multivariate Analysis: Discrete Variables (Correspondence Models); Multivariate Analysis: Discrete Variables (Overview); Scaling: Multidimensional
Bibliography
Benzécri J-P 1973 L’Analyse des Données. Vol. I: La Classification, Vol. II: L’Analyse des Correspondances. Dunod, Paris
Blasius J, Greenacre M J 1998 Visualization of Categorical Data. Academic Press, San Diego, CA
Carroll J D, Kumbasar E, Romney A K 1997 An equivalence relation between correspondence analysis and classical metric multidimensional scaling for the recovery of Euclidean distances. British Journal of Mathematical and Statistical Psychology 50: 81–92
Gifi A 1990 Nonlinear Multivariate Analysis. Wiley, Chichester, UK
Greenacre M J 1984 Theory and Applications of Correspondence Analysis. Academic Press, London
Greenacre M J 1993 Correspondence Analysis in Practice. Academic Press, London
Greenacre M J, Blasius J 1994 Correspondence Analysis in the Social Sciences. Academic Press, London
International Social Survey Program (ISSP) 1995 Survey on National Identity. Data set ZA 2880, Zentralarchiv für Empirische Sozialforschung, University of Cologne
Lebart L, Morineau A, Warwick K 1984 Multivariate Descriptive Statistical Analysis. Wiley, Chichester, UK
Nishisato S 1980 Analysis of Categorical Data: Dual Scaling and its Applications. University of Toronto Press, Toronto, Canada
M. Greenacre
Scaling: Multidimensional

The term 'Multidimensional Scaling' or MDS is used in two essentially different ways in statistics (de Leeuw and Heiser 1980a). MDS in the wide sense refers to any technique that produces a multidimensional geometric representation of data, where quantitative or qualitative relationships in the data are made to correspond with geometric relationships in the representation. MDS in the narrow sense starts with information about some form of dissimilarity between the elements of a set of objects, and it constructs its geometric representation from this information. Thus the data are 'dissimilarities,' which are distance-like quantities (or similarities, which are inversely related to distances). This entry concentrates only on narrow-sense MDS, because otherwise the definition of the technique is so diluted as to include almost all of multivariate analysis.
MDS is a descriptive technique, in which the notion of statistical inference is almost completely absent. There have been some attempts to introduce statistical models and corresponding estimating and testing methods, but they have been largely unsuccessful.
I introduce some quick notation. Dissimilarities are written as δij, and distances are dij(X). Here i and j are the objects of interest. The n × p matrix X is the configuration, with the coordinates of the objects in p-dimensional space. Often, data weights wij are also available, reflecting the importance or precision of dissimilarity δij.
1. Sources of Distance Data
Dissimilarity information about a set of objects can arise in many different ways. This article reviews some of the more important ones, organized by scientific discipline.
1.1 Geodesy
The most obvious application, perhaps, is in sciences in which distance is measured directly, although generally with error. This happens, for instance, in triangulation in geodesy, in which measurements are made which are approximately equal to distances, either Euclidean or spherical, depending on the scale of the experiment. In other examples, measured distances are less directly related to physical distances. For example, one could measure airplane, road, or train travel distances between different cities. Physical distance is usually not the only factor determining these types of dissimilarities.
1.2 Geography/Economics
In economic geography, or spatial economics, there are many examples of input–output tables, where the table indicates some type of interaction between a number of regions or countries. For instance, the data may have n countries, where entry fij indicates the number of tourists traveling, or the amount of grain
exported, from i to j. It is not difficult to think of many other examples of these square (but generally asymmetric) tables. Again, physical distance may be a contributing factor to these dissimilarities, but certainly not the only one.
1.3 Genetics/Systematics
A very early application of a scaling technique was Fisher (1922). He used crossing-over frequencies from a number of loci to construct a (one-dimensional) map of part of the chromosome. Another early application of MDS ideas is in Boyden (1931), where reactions to sera are used to give similarities between common mammals, and these similarities are then mapped into three-dimensional space. In much of systematic zoology, distances between species or individuals are actually computed from a matrix of measurements on a number of variables describing the individuals. There are many measures of similarity or distance which have been used, not all of them having the usual metric properties. The derived dissimilarity or similarity matrix is analyzed by MDS, or by cluster analysis, because systematic zoologists show an obvious preference for tree representations over continuous representations in p-dimensional space.
1.4 Psychology/Phonetics
MDS, as a set of data analysis techniques, clearly originates in psychology. There is a review of the early history, which starts with Carl Stumpf around 1880, in de Leeuw and Heiser (1980a). Developments in psychophysics concentrated on specifying the shape of the function relating dissimilarities and distances, until Shepard (1962) made the radical proposal to let the data determine this shape, requiring this function only to be increasing. In psychophysics, one of the basic forms in which data are gathered is the 'confusion matrix.' Such a matrix records how many times row-stimulus i was identified as column-stimulus j. A classical example is the Morse code signals studied by Rothkopf (1957). Confusion matrices are not unlike the input–output matrices of economics. In psychology (and marketing) researchers also collect direct similarity judgments in various forms to map cognitive domains. Ekman's color similarity data is one of the prime examples (Ekman 1963), but many measures of similarity (rankings, ratings, ratio estimates) have been used.
1.5 Psychology/Political Science/Choice Theory
Another source of distance information is 'preference data.' If a number of individuals indicate their preferences for a number of objects, then many choice models use geometrical representations in which an individual prefers the object she is closer to. This leads to ordinal information about the distances between the individuals and the objects, e.g., between the politicians and the issues they vote for, or between the customers and the products they buy.

1.6 Biochemistry
Fairly recently, MDS has been applied in the conformation of molecular structures from nuclear magnetic resonance data. The pioneering work is Crippen (1977), and a more recent monograph is Crippen and Havel (1988). Recently, this work has become more important because MDS techniques are used to determine protein structure. Numerical analysts and mathematical programmers have been involved, and as a consequence there have been many new and exciting developments in MDS.

2. An Example
Section 1 shows that it will be difficult to find an example that illustrates all aspects of MDS. We select one that can be used in quite a few of the techniques discussed. It is taken from Coombs (1964, p. 464). The data are cross-references between ten psychological journals. The journals are given in Table 1. The actual data are in Table 2. The basic idea, of course, is that journals with many cross-references are similar.

Table 1
Ten psychology journals

    Journal                                                Label
A   American Journal of Psychology                         AJP
B   Journal of Abnormal and Social Psychology              JASP
C   Journal of Applied Psychology                          JAP
D   Journal of Comparative and Physiological Psychology    JCPP
E   Journal of Consulting Psychology                       JCP
F   Journal of Educational Psychology                      JEP
G   Journal of Experimental Psychology                     JExP
H   Psychological Bulletin                                 PB
I   Psychological Review                                   PR
J   Psychometrika                                          Pka
Table 2
References in row-journal to column-journal

      A    B    C    D    E    F    G    H    I    J
A   122    4    1   23    4    2  135   17   39    1
B    23  303    9   11   49    4   55   50   48    7
C     0   28   84    2   11    6   15   23    8   13
D    36   10    4  304    0    0   98   21   65    4
E     6   93   11    1  186    6    7   30   10   14
F     6   12   11    1    7   34   24   16    7   14
G    65   15    3   33    3    3  337   40   59   14
H    47  108   16   81  130   14  193   52   31   12
I    22   40    2   29    8    1   97   39  107   13
J     2    0    2    0    0    1    6   14    5   59

3. Types of MDS
There are two different forms of MDS, depending on how much information is available about the distances. In some of the applications reviewed in Sect. 1 the dissimilarities are known numbers, equal to distances, except perhaps for measurement error. In other cases only the rank order of the dissimilarities is known, or only a subset of them is known.

3.1 Metric Scaling
In metric scaling the dissimilarities between all objects are known numbers, and they are approximated by distances. Thus objects are mapped into a metric space, distances are computed, and compared with the dissimilarities. Then objects are moved in such a way that the fit becomes better, until some loss function is minimized. In geodesy and molecular genetics this is a reasonable procedure because dissimilarities correspond rather directly with distances. In analyzing input–output tables, however, or confusion matrices, such tables are often clearly asymmetric and not likely to be
directly translatable into distances. Such cases often require a model to correct for asymmetry and scale. The most common class of models (for counts in a square table) is E(fij) = αi βj exp{−φ(dij(X))}, where φ is some monotone transformation through the origin. For φ equal to the identity this is known as the choice model for recognition experiments in mathematical psychology (Luce 1963), and as a variation of the quasi-symmetry model in statistics (Haberman 1974). The negative exponential of the distance function was also used by Shepard (1957) in his early theory of recognition experiments.
As noted in Sect. 1.3, in systematic zoology and ecology, the basic data matrix is often a matrix in which n objects are measured on p variables. The first step in the analysis is to convert this into an n × n matrix of similarities or dissimilarities. Which measure of (dis)similarity is chosen depends on the types of variables in the problem. If they are numerical, Euclidean distances or Mahalanobis distances can be used, but if they are binary other dissimilarity measures come to mind (Gower and Legendre 1986). In any case, the result is a matrix which can be used as input in a metric MDS procedure.

3.2 Nonmetric Scaling
In various situations, in particular in psychology, only the rank order of the dissimilarities is known. This is either because only ordinal information is collected (for instance by using paired or triadic comparisons) or because, while the assumption is natural that the function relating dissimilarities and distances is monotonic, the choice of a specific functional form is not.
There are other cases in which there is incomplete information. For example, observations may only be available on a subset of the distances, either by design or by certain natural restrictions on what is observable. Such cases lead to a distance completion problem, where the configuration is constructed from a subset of the distances, and at the same time the other (missing) distances are estimated. Such distance completion problems (assuming that the observed distances are measured without error) are currently solved with mathematical programming methods (Alfakih et al. 1998).
3.3 Three-way Scaling
In 'three-way scaling' information is available on dissimilarities between n objects on m occasions, or for m subjects. Two easy ways of dealing with the occasions are to perform either a separate MDS for each subject or a single MDS for the average occasion. Three-way MDS constitutes a strategy between these two extremes. This technique requires computation of m MDS solutions, but they are required to be related to each other. For instance, one can impose the restriction that the configurations are the same, but the transformations relating dissimilarities and distances are different. Or one could require that the projections on the dimensions are linearly related to each other, in the sense that dij(Xk) = dij(X Wk), where Wk is a diagonal matrix characterizing occasion k. A very readable introduction to three-way scaling is Arabie et al. (1987).

3.4 Unfolding
In 'multidimensional unfolding,' information is only available about off-diagonal dissimilarities, either metric or nonmetric. This means dealing with two different sets of objects, for instance individuals and stimuli or members of congress and political issues, and dissimilarities between members of the first set and members of the second set, and not on the within-set dissimilarities. This typically happens with preference and choice data, in which how individuals like candies, or candidates like issues, is known, but not how the individuals like other individuals, and so on. In many cases, the information in unfolding is also only ordinal. Moreover, it is 'conditional,' which means that while it is known that a politician prefers one issue over another, it is not known if a politician's preference for an issue is stronger than another politician's preference for another issue. Thus the ordinal information is only within rows of the off-diagonal matrix. This makes unfolding data, especially nonmetric unfolding data, extremely sparse.

3.5 Restricted MDS
In many cases it makes sense to impose restrictions on the representation of the objects in MDS. The design of a study may be such that the objects are naturally on a rectangular grid, for instance, or on a circle or ellipse. Often, incorporating such prior information leads to a more readily interpretable and more stable MDS solution. As noted in Sect. 3.3, some of the more common applications of restricted MDS are to three-way scaling.

4. Existence Theorem
The basic existence theorem in Euclidean MDS, in matrix form, is due to Schoenberg (1935). A more modern version was presented in the book by Torgerson (1958). I give a simple version here. Suppose E is a nonnegative, hollow, symmetric matrix of order n, and suppose Jn = In − n⁻¹ en en′ is the 'centering' operator. Here In is the identity, and en is a vector with all elements equal to one. Then E is a matrix of squared
Euclidean distances between n points in p-dimensional space if and only if −½ Jn E Jn is positive semi-definite of rank less than or equal to p.
This theorem has been extended to the classical non-Euclidean geometries, for instance by Blumenthal (1953). It can also be used to show that any nonnegative, hollow, symmetric E can be embedded 'nonmetrically' in n − 2 dimensions.
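The theorem translates directly into the classical (Torgerson) scaling construction used below in Sects. 6.3 and 7.2: double-center the squared dissimilarities and take an eigendecomposition. The following is a minimal numpy sketch of that construction (the function name and the clipping of small negative eigenvalues are choices of this illustration, not part of the original treatment):

```python
import numpy as np

def classical_scaling(E, p=2):
    """Embed a matrix E of squared Euclidean distances in p dimensions.

    Uses the construction behind Schoenberg's theorem:
    B = -1/2 * J E J is positive semi-definite when E is Euclidean.
    """
    n = E.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering operator J_n
    B = -0.5 * J @ E @ J                          # doubly centered matrix
    vals, vecs = np.linalg.eigh(B)                # eigenvalues in ascending order
    vals, vecs = vals[::-1][:p], vecs[:, ::-1][:, :p]
    return vecs * np.sqrt(np.clip(vals, 0, None))  # principal coordinates

# Sanity check: distances of a known 2-D configuration are recovered exactly.
X = np.random.default_rng(0).normal(size=(10, 2))
E = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
Y = classical_scaling(E, p=2)
E_hat = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
print(np.allclose(E, E_hat))                          # True
```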
5. Loss Functions

5.1 Least Squares on the Distances
The most straightforward loss function to measure fit between dissimilarities and distances is STRESS, defined by

STRESS(X) = Σi Σj wij (δij − dij(X))²,  (1)

with both sums running over 1, …, n. Obviously this formulation applies to metric scaling only. In the case of nonmetric scaling, the major breakthrough in a proper mathematical formulation of the problem was Kruskal (1964). For this case, STRESS is defined as

STRESS(X, D̂) = [Σi Σj wij (d̂ij − dij(X))²] / [Σi Σj wij (dij(X) − d̄(X))²],  (2)

with d̄(X) the average distance, and this function is minimized over both X and D̂, where D̂ satisfies the constraints imposed by the data. In nonmetric MDS the d̂ij are called disparities, and are required to be monotonic with the dissimilarities. Finding the optimal D̂ is an 'isotonic regression problem.' In the case of distance completion problems (with or without measurement error), the d̂ij must be equal to the observed distances if these are observed, and they are otherwise free.
One particular property of the STRESS loss function is that it is not differentiable for configurations in which two points coincide (and a distance is zero). It is shown by de Leeuw (1984) that at a local minimum of STRESS, pairs of points with positive dissimilarities cannot coincide.

5.2 Least Squares on the Squared Distances
A second loss function, which has been used a great deal, is SSTRESS, defined by

SSTRESS(X) = Σi Σj wij (δij² − dij²(X))².  (3)

Clearly, this loss function is a (fourth-order) multivariate polynomial in the coordinates. There are no problems with smoothness, but often a large number of local optima results. Of course a nonmetric version of the SSTRESS problem can be confronted, using the same type of approach used for STRESS.

5.3 Least Squares on the Inner Products
The existence theorem discussed above suggests a third way to measure loss. Now the function is known as STRAIN, and it is defined, in matrix notation, as

STRAIN(X) = tr{J(Δ⁽²⁾ − D⁽²⁾(X)) J (Δ⁽²⁾ − D⁽²⁾(X))},  (4)

where D⁽²⁾(X) and Δ⁽²⁾ are the matrices of squared distances and dissimilarities, and where J is the centering operator. Since JD⁽²⁾(X)J = −2XX′, this means that −½ JΔ⁽²⁾J is approximated by a positive semi-definite matrix of rank r, which is a standard eigenvalue–eigenvector computation. Again, nonmetric versions of minimizing STRAIN are straightforward to formulate (although less straightforward to implement).
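For concreteness, the three loss functions can be written down in a few lines of numpy. This is an illustrative sketch only: unit weights wij = 1 are assumed throughout, and the function names are inventions of this illustration.

```python
import numpy as np

def distances(X):
    """Euclidean distance matrix of an n x p configuration X."""
    return np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

def stress(X, delta):
    """Raw metric STRESS, Eqn. (1), with unit weights."""
    return ((delta - distances(X)) ** 2).sum()

def sstress(X, delta):
    """SSTRESS, Eqn. (3): least squares on the squared distances."""
    return ((delta ** 2 - distances(X) ** 2) ** 2).sum()

def strain(X, delta):
    """STRAIN, Eqn. (4): least squares on the doubly centered squared tables."""
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering operator
    R = J @ (delta ** 2 - distances(X) ** 2) @ J
    return np.trace(R @ R)                  # equals tr{J A J A} since J is idempotent
```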
Table 3
Transformed journal reference data

      A     B     C     D     E     F     G     H     I     J
A   0.00  2.93  4.77  1.89  3.33  2.78  0.77  1.02  1.35  3.79
B   2.93  0.00  2.28  3.32  1.25  2.61  2.39  0.53  1.41  4.24
C   4.77  2.28  0.00  3.87  2.39  1.83  3.13  1.22  3.03  2.50
D   1.89  3.32  3.87  0.00  5.62  4.77  1.72  1.11  1.41  4.50
E   3.33  1.25  2.39  5.62  0.00  2.44  3.89  0.45  2.71  3.67
F   2.78  2.61  1.83  4.77  2.44  0.00  2.46  1.01  2.90  2.27
G   0.77  2.39  3.13  1.72  3.89  2.46  0.00  0.41  0.92  2.68
H   1.02  0.53  1.22  1.11  0.45  1.01  0.41  0.00  0.76  1.42
I   1.35  1.41  3.03  1.41  2.71  2.90  0.92  0.76  0.00  2.23
J   3.79  4.24  2.50  4.50  3.67  2.27  2.68  1.42  2.23  0.00
6. Algorithms
6.1 Stress
The original algorithms (Kruskal 1964) for minimizing STRESS use gradient methods with an elaborate step-size procedure. In de Leeuw (1977) the 'majorization method' was introduced. It leads to a globally convergent algorithm with a linear convergence rate, which is not bothered by the nonexistence of derivatives at places where points coincide. The majorization method can be seen as a gradient method with a constant step-size, which uses convex analysis methods to prove convergence. More recently, faster linearly or superlinearly convergent methods have been tried successfully (Glunt et al. 1993, Kearsley et al. 1998). One of the key advantages of the majorization method is that it extends easily to restricted MDS problems (de Leeuw and Heiser 1980b). Each subproblem in the sequence is a least squares projection problem on the set of configurations satisfying the constraints, which is usually easy to solve.

6.2 SSTRESS
Algorithms for minimizing SSTRESS were developed initially by Takane et al. (1977). They applied cyclic coordinate descent, i.e., one coordinate was changed at a time, and cycles through the coordinates were alternated with isotonic regressions in the nonmetric case. More efficient alternating least squares algorithms were developed later by de Leeuw, Takane, and Browne (cf. Browne (1987)), and superlinear and quadratic methods were proposed by Glunt et al. (1991) and Kearsley et al. (1998).

6.3 STRAIN
Minimizing STRAIN was, and is, the preferred algorithm in metric MDS. It is also used as the starting point in iterative nonmetric algorithms. Recently, more general algorithms for minimizing STRAIN in nonmetric and distance completion scaling have been proposed by Trosset (1998a, 1998b).
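To make the majorization method of Sect. 6.1 concrete: in the unweighted metric case each iteration replaces X by its Guttman transform, and STRESS never increases. The sketch below is a bare-bones illustration under these simplifying assumptions (unit weights, random start, fixed iteration count); the function names are mine, and the weighted, restricted, and nonmetric variants need more machinery.

```python
import numpy as np

def guttman_transform(X, delta):
    """One majorization step for raw STRESS with unit weights w_ij = 1."""
    n = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(D > 0, delta / D, 0.0)   # guard against coinciding points
    B = -ratio
    B[np.diag_indices(n)] = ratio.sum(axis=1)     # the B(X) matrix of the method
    return B @ X / n                              # Guttman transform of X

def smacof(delta, p=2, iters=100, seed=0):
    """Iterate the transform from a random start; STRESS decreases monotonically."""
    X = np.random.default_rng(seed).normal(size=(delta.shape[0], p))
    for _ in range(iters):
        X = guttman_transform(X, delta)
    return X
```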
Figure 1 Metric analysis (STRAIN left, STRESS right)
Figure 2 Nonmetric analysis (transformation left, solution right)
7. Analysis of the Example

7.1 Initial Transformation
In the journal reference example, suppose E(fij) = αi βj exp{−φ(dij(X))}. In principle this model can be tested by contingency table techniques. Instead the model is used to transform the frequencies to estimated distances, yielding

−log √[(f̃ij f̃ji) / (f̃ii f̃jj)] ≐ φ(dij(X)),

where f̃ij = fij + ½. This transformed matrix is given in Table 3.
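A small numpy sketch of this transformation (the function name is mine); applied to the frequencies of Table 2 it reproduces the entries of Table 3, e.g., 2.93 for the pair (A, B):

```python
import numpy as np

def freq_to_distance(F):
    """Transform a square frequency table to estimated distances phi(d_ij)."""
    Ft = F + 0.5                                   # f-tilde: add 1/2 to every cell
    num = Ft * Ft.T                                # f_ij * f_ji
    den = np.outer(np.diag(Ft), np.diag(Ft))       # f_ii * f_jj
    return -np.log(np.sqrt(num / den))

# First two rows/columns of Table 2 (journals A and B):
F = np.array([[122.0, 4.0], [23.0, 303.0]])
print(np.round(freq_to_distance(F), 2))            # off-diagonal entries are 2.93
```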
7.2 Metric Analysis
In the first analysis, suppose the numbers in Table 3 are approximate distances, i.e., suppose that φ is the identity. Then STRAIN is minimized, using metric MDS, by calculating the dominant eigenvalues and corresponding eigenvectors of the doubly-centered squared distance matrix. The second analysis iteratively minimizes metric STRESS, using the majorization algorithm. The two resulting two-dimensional configurations are given in Fig. 1. Both solutions show the same grouping of journals, with Pka as an outlier, the journals central to the discipline, such as AJP, JExP, PB, and PR, in the middle, and more specialized journals generally in the periphery. For comparison purposes, the STRESS of the first solution is 0.0687, that of the second solution 0.0539. Finding the second solution takes about 30 iterations.
7.3 Nonmetric STRESS Analysis
Next, nonmetric STRESS is minimized on the same data (using only their rank order). The solution is in Fig. 2. The left panel displays the transformation relating the data in Table 3 to the optimally transformed data, a monotone step function. Again, basically the same configuration of journals, with the same groupings, emerges. The nonmetric solution has a (normalized) STRESS of 0.0195, and again finding it takes about 30 iterations of the majorization method. The optimal transformation does not seem to deviate systematically from linearity.
8. Further Reading
Until recently, the classical MDS reference was the little book by Kruskal and Wish (1978). It is clearly written, but very elementary. A more elaborate practical introduction is by Coxon (1982), which has a useful companion volume (Davies and Coxon 1982) with many of the classical MDS papers. Some additional early intermediate-level books, written from the psychometric point of view, are Davison (1983) and Young (1987). More recently, more modern and advanced books have appeared. The most complete treatment is no doubt Borg and Groenen (1997), while Cox (1994) is another good introduction especially aimed at statisticians.
Bibliography
Alfakih A Y, Khandani A, Wolkowicz H 1998 Solving Euclidean distance matrix completion problems via semidefinite programming. Computational Optimization and Applications 12: 13–30
Arabie P, Carroll J D, DeSarbo W S 1987 Three-Way Scaling and Clustering. Sage, Newbury Park
Blumenthal L M 1953 Distance Geometry. Oxford University Press, Oxford, UK
Borg I, Groenen P 1997 Modern Multidimensional Scaling. Springer, New York
Boyden A 1931 Precipitin tests as a basis for a comparative phylogeny. Proceedings of the Society for Experimental Biology and Medicine 29: 955–7
Browne M 1987 The Young–Householder algorithm and the least squares multidimensional scaling of squared distances. Journal of Classification 4: 175–90
Coombs C H 1964 A Theory of Data. Wiley, New York
Cox T F 1994 Multidimensional Scaling. Chapman and Hall, New York
Coxon A P M 1982 The User's Guide to Multidimensional Scaling: With Special Reference to the MDS(X) Library of Computer Programs. Heinemann, Exeter, NH
Crippen G M 1977 A novel approach to calculation of conformation: Distance geometry. Journal of Computational Physics 24: 96–107
Crippen G M, Havel T F 1988 Distance Geometry and Molecular Conformation. Wiley, New York
Davies P M, Coxon A P M 1982 Key Texts in Multidimensional Scaling. Heinemann, Exeter, NH
Davison M L 1983 Multidimensional Scaling. Wiley, New York
de Leeuw J 1977 Applications of convex analysis to multidimensional scaling. In: Barra J R, Brodeau F, Romier G, van Cutsem B (eds.) Recent Developments in Statistics: Proceedings of the European Meeting of Statisticians, Grenoble, 6–11 September, 1976. North Holland, Amsterdam, The Netherlands, pp. 133–45
de Leeuw J 1984 Differentiability of Kruskal's stress at a local minimum. Psychometrika 49: 111–3
de Leeuw J, Heiser W 1980a Theory of multidimensional scaling. In: Krishnaiah P (ed.) Handbook of Statistics. North Holland, Amsterdam, The Netherlands, Vol. II
de Leeuw J, Heiser W J 1980b Multidimensional scaling with restrictions on the configuration. In: Krishnaiah P (ed.)
Multivariate Analysis. North Holland, Amsterdam, The Netherlands, Vol. V, pp. 501–22
Ekman G 1963 Direct method for multidimensional ratio scaling. Psychometrika 23: 33–41
Fisher R A 1922 The systematic location of genes by means of cross-over ratios. American Naturalist 56: 406–11
Glunt W, Hayden T L, Liu W M 1991 The embedding problem for predistance matrices. Bulletin of Mathematical Biology 53: 769–96
Glunt W, Hayden T, Raydan M 1993 Molecular conformations from distance matrices. Journal of Computational Chemistry 14: 114–20
Gower J C, Legendre P 1986 Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification 3: 5–48
Haberman S J 1974 The Analysis of Frequency Data. University of Chicago Press, Chicago
Kearsley A J, Tapia R A, Trosset M W 1998 The solution of the metric STRESS and SSTRESS problems in multidimensional scaling using Newton's method. Computational Statistics 13: 369–96
Kruskal J B 1964 Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29: 1–27
Kruskal J B, Wish M 1978 Multidimensional Scaling. Sage, Beverly Hills, CA
Luce R 1963 Detection and recognition. In: Luce R D, Bush R R, Galanter E (eds.) Handbook of Mathematical Psychology. Wiley, New York, Vol. 1
Rothkopf E Z 1957 A measure of stimulus similarity and errors in some paired-associate learning tasks. Journal of Experimental Psychology 53: 94–101
Schoenberg I J 1935 Remarks on Maurice Fréchet's article 'Sur la définition axiomatique d'une classe d'espaces distanciés vectoriellement applicable sur l'espace de Hilbert.' Annals of Mathematics 36: 724–32
Shepard R N 1957 Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika 22: 325–45
Shepard R N 1962 The analysis of proximities: Multidimensional scaling with an unknown distance function. Psychometrika 27: 125–40, 219–46
Takane Y, Young F W, de Leeuw J 1977 Nonmetric individual differences in multidimensional scaling: An alternating least squares method with optimal scaling features. Psychometrika 42: 7–67
Torgerson W S 1958 Theory and Methods of Scaling. Wiley, New York
Trosset M W 1998a Applications of multidimensional scaling to molecular conformation. Computing Science and Statistics 29: 148–52
Trosset M W 1998b A new formulation of the nonmetric strain problem in multidimensional scaling. Journal of Classification 15: 15–35
Young F W 1987 Multidimensional Scaling: History, Theory, and Applications. Erlbaum, Hillsdale, NJ
J. de Leeuw
Scandal: Political

The word 'scandal' is used primarily to describe a sequence of actions and events which involve certain kinds of transgressions and which, when they become
known to others, are regarded as sufficiently serious to elicit a response of disapproval or condemnation. A scandal is necessarily a public event in the sense that, while the actions which lie at the heart of the scandal may have been carried out secretly or covertly, a scandal can arise only if these actions become known to others, or are strongly believed by others to have occurred. This is one respect in which scandal differs from related phenomena such as corruption and bribery; a scandal can be based on the disclosure of corruption or bribery, but corruption and bribery can exist (and often do exist) without being known about by others, and hence without becoming a scandal.
1. The Concept of Scandal
The concept of scandal is very old and the meaning has changed over time. In terms of its etymological origins, the word probably derives from the Indogermanic root skand-, meaning to spring or leap. Early Greek derivatives, such as the word skandalon, were used in a figurative way to signify a trap, an obstacle or a 'cause of moral stumbling.' The idea of a trap or an obstacle was an integral feature of the theological vision of the Old Testament. In the Septuagint (the Greek version of the Old Testament), the word skandalon was used to describe an obstacle, a stumbling block placed along the path of the believer, which could explain how a people linked to God might nevertheless begin to doubt Him and lose their way. The notion of a trap or obstacle became part of Judaism and early Christian thought, although it was gradually prised apart from the idea of a test of faith. With the development of the Latin word scandalum and its diffusion into Romance languages, the religious connotation was gradually attenuated and supplemented by other senses. The word 'scandal' first appeared in English in the sixteenth century; similar words appeared in other Romance languages at roughly the same time. The early uses of 'scandal' in the sixteenth and seventeenth centuries were, broadly speaking, of two main types. First, 'scandal' was used in a religious context to refer to the conduct of a person which brought discredit to religion, or to something which hindered religious faith or belief. Second, 'scandal' and its cognates were also used in more secular contexts to describe actions or utterances which were regarded as scurrilous or abusive, which damaged an individual's reputation, which were grossly discreditable, and/or which offended moral sentiments or the sense of decency. It is these later, more secular senses which underlie the most common modern uses of the word 'scandal.' Although the word continues to have some use as a specialized religious term, 'scandal' is used mainly to refer to a broader form of moral transgression which is no longer linked specifically to religious codes. More precisely, 'scandal' could be defined as actions or events which have the following characteristics: their
occurrence involves the transgression of certain values, norms or moral codes; their occurrence involves a degree of secrecy or concealment, but they are known or strongly believed to exist by individuals other than those directly involved; some individuals disapprove of the actions or events and may be offended by the transgression; some express their disapproval by publicly denouncing the actions or events; and the disclosure and condemnation of the actions or events may damage the reputation of the individuals responsible for them. While scandal necessarily involves some form of transgression, there is a great deal of cultural and historical variability in the kinds of values, norms, and moral codes which are relevant here. What counts as scandalous activity in one context, e.g., extramarital affairs among members of the political elite, may be regarded as acceptable (even normal) elsewhere. A particular scandal may also involve different types of transgression. A scandal may initially be based on the transgression of a moral code (e.g., concerning sexual relations), but as the scandal develops, the focus of attention may shift to a series of 'second-order transgressions' which stem from actions aimed at concealing the original offence. The attempt to cover up a transgression—a process that may involve deception, obstruction, false denials, and straightforward lies—may become more important than the original transgression itself, giving rise to an intensifying cycle of claim and counterclaim that dwarfs the original offence. Scandals can occur in different settings and milieux, from local communities to the arenas of national and even international politics. When scandals occur in settings which are more extended than local communities, they generally involve mediated forms of communication such as newspapers, magazines, and, increasingly, television. The media play a crucial role in making public the actions or events which lie at the heart of scandal, either by reporting allegations or information obtained by others (such as the police or the courts) or by carrying out their own investigations. The media also become a principal forum in which disapproval of the actions or events is expressed. Mediated scandals are not simply scandals which are reported by the media; rather, they are 'mediated events' which are partly constituted by the activities of media organizations.
2. The Nature of Political Scandal
Scandals are common in many spheres of social life; not all scandals are political scandals. So what are the distinctive features of political scandals? One seemingly straightforward way of answering this question is to say that a political scandal is any scandal that involves a political leader or figure. But this is not a particularly helpful or illuminating answer. For an
individual is a political leader or figure by virtue of a broader set of social relations and institutions which endow him or her with power. So if we wish to understand the nature of political scandal, we cannot focus on the individual alone. Another way of answering this question would be to focus not on the status of the individuals involved, but on the nature of the transgression. This is the approach taken by Markovits and Silverstein, two political scientists who have written perceptively about political scandal. According to Markovits and Silverstein (1988), the defining feature of political scandal is that it involves a 'violation of due process.' By 'due process' they mean the legally binding rules and procedures which govern the exercise of political power. Political scandals are scandals in which these rules and procedures are violated by those who exercise political power and who seek to increase their power at the expense of due process. Since due process is fully institutionalized only in the liberal democratic state, it follows, argue Markovits and Silverstein, that political scandals can occur only in liberal democracies. One strength of Markovits and Silverstein's account is that it analyzes political scandal in relation to some of the most important institutional features of modern states. But the main shortcoming of this account is that it provides a rather narrow view of political scandal. It treats one dynamic—the pursuit of power at the expense of process—as the defining feature of political scandal. Hence any scandal that does not involve this particular dynamic is ipso facto nonpolitical. This means that a whole range of scandals, such as those based on sexual transgressions, would be ruled out as nonpolitical, even though they may involve senior political figures and may have far-reaching political consequences. Markovits and Silverstein's claim that political scandal can occur only in liberal democracies should also be viewed with some caution. It is undoubtedly the case that liberal democracies are particularly prone to political scandal, but this is due to a number of specific factors (such as the highly competitive nature of liberal democratic politics and the relative autonomy of the press) and it does not imply that political scandal is unique to this type of political organization. Political scandals can occur (and have occurred) in other types of political system, from the absolutist and constitutional monarchies of early modern Europe to the various forms of authoritarian regime which have existed in the twentieth century. But political scandals in these other types of political system are more likely to remain localized scandals and are less likely to spread beyond the relatively closed worlds of the political elite. An alternative way of conceptualizing political scandals is to regard them as scandals involving individuals or actions which are situated within a political field (Thompson 2000). It is the political field that constitutes the scandal as political; it provides the
context for the scandal and shapes its pattern of development. A field is a structured space of social positions whose properties are defined primarily by the relations between these positions and the resources attached to them. The political field can be defined as the field of action and interaction which bears on the acquisition and exercise of political power. Political scandals are scandals which occur within the political field and which have an impact on relations within this field. They may involve the violation of rules and procedures governing the exercise of political power, but they do not have to involve this; other kinds of transgression can also constitute political scandals. We can distinguish between three main types of political scandal, depending on the kinds of norms or codes which are transgressed. Sex scandals involve the transgression of norms or codes governing the conduct of sexual relations. In some contexts, sexual transgressions carry a significant social stigma and their disclosure may elicit varying degrees of disapproval by others. Financial scandals involve the infringement of rules governing the acquisition and allocation of economic resources; these include scandals involving bribery, kickbacks and other forms of corruption as well as scandals stemming from irregularities in the raising and deployment of campaign funds. Power scandals are based on the disclosure of activities which infringe the rules governing the acquisition or exercise of political power. They involve the unveiling of hidden forms of power, and actual or alleged abuses of power, which had hitherto been concealed beneath the public settings in which power is displayed and the publicly recognized procedures through which it is exercised.
3. The Rise of Political Scandal
The origins of political scandal as a mediated event can be traced back to the pamphlet culture of the seventeenth and eighteenth centuries. During the period of the English Civil War, for instance, there was a proliferation of anti-Royalist pamphlets and newsbooks which were condemned as heretical, blasphemous, scurrilous, and 'scandalous' in character. Similarly, in France, a distinctive genre of subversive political literature had emerged by the early eighteenth century, comprising the libelles and the chroniques scandaleuses, which purported to recount the private lives of kings and courtiers and presented them in an unflattering light. However, in the late eighteenth and nineteenth centuries, the use of 'scandal' in relation to mediated forms of communication began to change, as the term was gradually prised apart from its close association with blasphemy and sedition and increasingly applied to a range of phenomena which displayed the characteristics we now associate with scandal. By the late nineteenth century, mediated scandal had become a relatively common feature of the
political landscape in countries such as Britain and the USA. In Britain there were a number of major scandals, many involving sexual transgressions of various kinds, which destroyed (or threatened to destroy) the careers of key political figures such as Sir Charles Dilke (a rising star of the Liberal Party whose career was irrevocably damaged by the events surrounding a divorce action in which he was named as co-respondent) and Charles Parnell (the charismatic leader of the Irish parliamentary party whose career was destroyed by revelations concerning his affair with Mrs. Katharine O'Shea). There were numerous political scandals in nineteenth-century America too, some involving actual or alleged sexual transgressions (such as the scandal surrounding Grover Cleveland who, it was said, had fathered an illegitimate child) and many involving corruption at municipal, state and federal levels of government. America in the Gilded Age witnessed a flourishing of financial scandals in the political field, and the period of Grant's Presidency (1869–77) is regarded by many as one of the most corrupt in American history. While the nineteenth century was the birthplace of political scandal as a mediated event, the twentieth century was to become its true home. Once this distinctive type of event had been invented, it would become a recognizable genre that some would seek actively to produce while others would strive, with varying degrees of success, to avoid. The character and frequency of political scandals varied considerably from one national context to another, and depended on a range of specific social and political circumstances. In Britain and the USA, there were significant political scandals throughout the early decades of the twentieth century, but political scandals have become particularly prevalent in the period since the early 1960s. In Britain, the Profumo scandal of 1963 was a watershed. This was a classic sex scandal involving a senior government minister (John Profumo, Secretary of State for War) and an attractive young woman (Christine Keeler), but it also involved issues of national security and a series of second-order transgressions which proved fatal for Profumo's career. In the USA, the decisive political scandal of the twentieth century was undoubtedly Watergate, a power scandal in which Nixon was eventually forced to resign as President in the face of his imminent impeachment. Many countries have developed their own distinctive political cultures of scandal which have been shaped by, among other things, their respective traditions of scandal, the activities of journalists, media organizations, and other agents in the political field, the deployment of new technologies of communication, and the changing political climate of the time. Political scandal has become a potent weapon in struggles between rival candidates and parties in the political field. As fundamental disagreements over matters of principle have become less pronounced, questions of character and trust have become
increasingly central to political debate and scandal has assumed increasing significance as a 'credibility test.' In this context, the occurrence of scandal tends to have a cumulative effect: scandal breeds scandal, because each scandal exposes character failings and further sharpens the focus on the credibility and trustworthiness of political leaders. This is the context in which President Clinton found that his political career was nearly destroyed by scandal on more than one occasion. Like many presidential hopefuls in the past, Clinton campaigned on the promise to clean up politics after the sleaze of the Reagan administration. But he soon found that members of his own administration—and, indeed, that he and his wife—were being investigated on grounds of possible financial wrongdoing. He also found that allegations and revelations concerning his private life would become highly public issues, threatening to derail his campaign in 1992 (with the Gennifer Flowers affair) and culminating in his impeachment and trial by the Senate following the disclosure of his affair with Monica Lewinsky. What led to Clinton's impeachment was not the disclosure of the affair as such, but rather a series of second-order transgressions committed in relation to a sexual harassment case instituted by Paula Jones, in the context of which Clinton gave testimony under oath denying that he had had sexual relations with Monica Lewinsky, thereby laying himself open to the charge of perjury, among other things. Clinton's trial in the Senate resulted in his acquittal, but his reputation was undoubtedly damaged by the scandal which overshadowed the second term of his presidency.

See also: Elites: Sociological Aspects; Mass Media: Introduction and Schools of Thought; Political Culture; Political Discourse; Political Trials; Public Opinion: Political Aspects
Bibliography
Allen L et al. 1990 Political Scandals and Causes Célèbres Since 1945: An International Reference Compendium. Longman, Harlow, UK (published in the USA and Canada by St. James Press, Chicago, IL)
Garment S 1992 Scandal: The Culture of Mistrust in American Politics. Doubleday, New York
King A 1986 Sex, money, and power. In: Hodder-Williams R, Ceaser J (eds.) Politics in Britain and the United States: Comparative Perspectives. Duke University Press, Durham, NC, pp. 173–222
Markovits A S, Silverstein M (eds.) 1988 The Politics of Scandal: Power and Process in Liberal Democracies. Holmes and Meier, New York
Schudson M 1992 Watergate in American Memory: How We Remember, Forget, and Reconstruct the Past. Basic Books, New York Thompson J B 2000 Political Scandal: Power and Visibility in the Media Age. Polity, Cambridge, UK
J. B. Thompson
Schemas, Frames, and Scripts in Cognitive Psychology

The terms 'schema,' 'frame,' and 'script' are all used to refer to a generic mental representation of a concept, event, or activity. For example, people have a generic representation of a visit to the dentist's office that includes a waiting room, an examining room, dental equipment, and a standard sequence of events in one's experience at the dentist. Differences in how each of the three terms, schema, frame, and script, are used reflect the various disciplines that have contributed to an understanding of how such generic representations are used to facilitate ongoing information processing. This article will provide a brief family history of the intellectual ancestry of modern views of generic knowledge structures as well as an explanation of how such knowledge influences mental processes.
1. Early Development of the 'Schema' Concept
The term 'schema' was already in prominent use in the writings of psychologists and neurologists by the early part of the twentieth century (e.g., Bartlett 1932, Head 1920, see also Piaget's Theory of Child Development). Though the idea that our behavior is guided by schemata (the traditional plural form of schema) was fairly widespread, the use of the term 'schema' was seldom well defined. Nevertheless, the various uses of the term were reminiscent of its use by the eighteenth-century philosopher Immanuel Kant. In his influential treatise Critique of Pure Reason, first published in 1781, Kant wrestled with the question of whether all knowledge is dependent on sensory experience or whether there exists a priori knowledge, which presumably would consist of elementary concepts, such as space and time, necessary for making sense of the world. Kant believed that we depend both on our knowledge gained from sensory experience and a priori knowledge, and that we use a generic form of knowledge, schemata, to mediate between sensory-based and a priori knowledge. Unlike today, when the idea of a schema is most often invoked for large-scale structured knowledge, Kant used the term in reference to more basic concepts. In Critique of Pure Reason he wrote: 'In truth, it is not images of objects, but schemata, which lie at the foundation of our pure sensuous conceptions. No
image could ever be adequate to our conception of triangles in general … The schema of a triangle can exist nowhere else than in thought and it indicates a rule of the synthesis of the imagination in regard to pure figures in space.' The key idea captured by Kant was that people need mental representations that are typical of a class of objects or events so that we can respond to the core similarities across different stimuli of the same class (see Natural Concepts, Psychology of; Mental Representations, Psychology of). The basic notion that our experience of the world is filtered through generic representations was also at the heart of Sir Frederic Bartlett's use of the term 'schema.' Bartlett's studies of memory, described in his classic book published in 1932, presaged the rise of schema-theoretic views of memory and text comprehension in the 1970s and 1980s. In his most influential series of studies, Bartlett asked people to read and retell folktales from unfamiliar cultures. Because they lacked the background knowledge that rendered the folktales coherent from the perspective of the culture of origin, the participants in Bartlett's studies tended to conventionalize the stories by adding connectives and explanatory concepts that fit the reader's own world view. The changes were often quite radical, particularly when the story was recalled after a substantial delay. For example, in some cases the reader gave the story a rather conventional moral consistent with more familiar folktales. Bartlett interpreted these data as indicating that: 'Remembering is not the re-excitation of innumerable fixed, lifeless, and fragmentary traces' (p. 213). Instead, he argued, memory for a particular stimulus is reconstructed based on both the stimulus itself and on a relevant schema. By schema, Bartlett meant a 'whole active mass of organised past reactions or experience.' Thus, if a person's general knowledge of folktales includes the idea that such tales end with an explicit moral, then in reconstructing an unfamiliar folktale the person is likely to elaborate the original stimulus by adding a moral that appears relevant (see Reconstructive Memory, Psychology of). Although Bartlett made a promising start on understanding memory for complex naturalistic stimuli, his work was largely ignored for the next 40 years, particularly in the United States where researchers focused on much more simplified learning paradigms from a behavioral perspective. Then, interest in schemata returned as researchers in the newly developed fields of cognitive psychology and artificial intelligence attempted to study and to simulate discourse comprehension and memory (see Knowledge Activation in Text Comprehension and Problem Solving, Psychology of).
2. The Rediscovery of Schemata
Interest in schemata as explanatory constructs returned to psychology as a number of memory
researchers demonstrated that background knowledge can strongly influence the recall and recognition of text. For example, consider the following passage: 'The procedure is actually quite simple. First, you arrange items into different groups. Of course, one pile may be sufficient, depending on how much there is to do. If you have to go somewhere else due to lack of facilities that is the next step; otherwise, you are pretty well set.' This excerpt, from a passage used by Bransford and Johnson (1972), is very difficult to understand and recall unless the reader has an organizing schema for the information. So, readers who are supplied with the appropriate title, Washing Clothes, are able to recall the passage much more effectively than readers who are not given the title. Although the early 1970s witnessed a great number of clear demonstrations that previous knowledge shapes comprehension and memory, progress on specifying exactly how background knowledge becomes integrated with new information continued to be hampered by the vagueness of the schema concept. Fortunately, at this same time, researchers in artificial intelligence (AI) identified the issue of background knowledge as central to simulations of text processing. Owing to the notorious inability of computers to deal with vague notions, the AI researchers had to be explicit in their accounts of how background knowledge is employed in comprehension. A remarkable cross-fertilization occurred between psychology and AI that encouraged the development and testing of hypotheses about the representation and use of schematic knowledge.

2.1 Schema Theory in AI Research: Frames and Scripts
Although many researchers in the nascent state of AI research in the 1970s were unconcerned with potential connections to psychology, several pioneers in AI research noted the mutual relevance of the two fields early on. In particular, Marvin Minsky and Roger Schank developed formalizations of schema-theoretic approaches that were both affected by, and greatly affected, research in cognitive psychology. One of the core concepts from AI that was quickly adopted by cognitive psychologists was the notion of a frame. Minsky (1975) proposed that our experiences of familiar events and situations are represented in a generic way as frames that contain slots. Each slot corresponds to some expected element of the situation. One of the ways that Minsky illustrated what he meant by a frame was in reference to a job application form. A frame, like an application, has certain general domains of information being sought and leaves blanks for specific information to be filled in. As another example, someone who walks into the office of a university professor could activate a frame for a generic professor's office that would help orient the person to the situation and allow for easy identification
Figure 1 A child’s birthday party script. The script consists of a set of distinct scenes. The slots under each scene can be filled with default information, such as ‘hot dogs’ for party food. Note that the cake ceremony could be broken down further into scenes such as candle lighting, blowing out candles, etc. Thus, one script or schema can be embedded inside another
of its elements. So, a person would expect to see a desk and chair, a bookshelf with lots of books, and, these days at least, a computer. An important part of Minsky’s frame idea is that the slots of a frame are filled with default values if no specific information from the context is provided (or noticed). Thus, even if the person didn’t actually see a computer in the professor’s office, he or she might later believe one was there because computer would be a default value that filled an equipment slot in the office frame (see Brewer and Treyens 1981 for relevant empirical evidence). The idea that generic knowledge structures contain slots with default values was also a central part of Roger Schank’s work on scripts. Scripts are generic representations of common social situations such as going to the doctor, having dinner at a restaurant, or attending a party (Fig. 1). Teamed with a social psychologist, Robert Abelson, Schank developed computer simulations of understanding stories involving such social situations. The motivation for the development of the script concept was Schank’s observation that people must be able to make a large number of inferences quite automatically in order to comprehend discourse. This can be true even when the discourse consists of a relatively simple pair of sentences: Q: So, would you like to get some dinner? A: Oh, I just had some pizza a bit ago. Even though the question is not explicitly answered, a reader of these two sentences readily understands that the dinner invitation was declined. Note that the interpretation of this exchange is more elaborate if the reader knows that it occurred between a young man and a young woman who met at some social event and 13524
had a lengthy conversation. In this context, a script for asking someone on a date may be invoked and the answer is interpreted as a clear ‘brush off.’ Thus, a common feature of the AI work on frames and scripts in simulations of natural language understanding was that inferences play a key role. Furthermore, these researchers provided formal descriptions of generic knowledge structures that could be used to generate the inferences necessary to comprehension. Both AI researchers and cognitive psychologists of the time recognized that such knowledge structures captured the ideas implicit in Bartlett’s more vague notion of a schema. The stage was set to determine whether frames and scripts were useful descriptors for the knowledge that people actually use when processing information (see Inferences in Discourse, Psychology of; Memory for Text).
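Before turning to the empirical evidence, Minsky's slots-with-default-values idea can be made concrete in a few lines of code. The following is a deliberately toy Python sketch, entirely an illustration of this article rather than anything from the AI literature; the frame contents and names are invented:

```python
# A minimal rendering of a Minsky-style frame: a set of slots, each with a
# default filler that is used unless the current context fills the slot in.
office_frame_defaults = {
    "furniture": "desk and chair",
    "bookshelf": "lots of books",
    "equipment": "computer",        # default value for the equipment slot
}

def instantiate(defaults, observed):
    """Fill a frame: observed slot fillers override the defaults."""
    return {**defaults, **observed}

# A visitor who noticed only a typewriter still 'remembers' the books,
# because unobserved slots are filled with their default values.
visit = instantiate(office_frame_defaults, {"equipment": "typewriter"})
print(visit)
```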
2.2 Empirical Support for Schema Theory
If people's comprehension and memory processes are based on retrieving generic knowledge structures and, as necessary, filling slots of those knowledge structures with default values, then several predictions follow. First, people with similar experiences of common situations and events will have similar schemata represented in memory. Second, as people retrieve such schemata during processing, they will make inferences that go beyond directly stated information. Third, in many cases, particularly when processing is guided by a script-like structure, people will have expectations about not only the content of a situation but also the sequence of events. Cognitive psychologists obtained empirical support for each of these predictions. For example, Bower et al. (1979) conducted a series of experiments on people's generation of, and use of, scripts for common events like going to a nice restaurant, attending a lecture, or visiting the doctor. They found high agreement on the actions that take place in each situation and on the ordering. Moreover, there was good agreement on the major clusters of actions, or scenes, within each script. Consistent with the predictions of schema theory, when people read stories based on a script, their recall of the story tended to include actions from the script that were not actually mentioned in the story. Bower et al. (1979) also used some passages in which an action from a script was explicitly mentioned but out of its usual order in the script. In their recall of such a passage, people tended to recall the action of interest in its canonical position rather than in its actual position in the passage. Cognitive psychologists interested in discourse processing further extended the idea of a schema capturing a sequence of events by proposing that during story comprehension people make use of a story grammar. A story grammar is a schema that specifies the abstract organization of the parts of a story. So, at the highest
level a story consists of a setting and one or more episodes. An episode consists of an event that presents a protagonist with an obstacle to be overcome. The protagonist then acts on a goal formulated to overcome the obstacle. That action then elicits reactions that may or may not accomplish the goal. Using such a scheme, each idea in a story can be mapped onto a hierarchical structure that represents the relationship among the ideas (e.g., Mandler 1984). We can use story grammars to predict people's sentence reading times and recall because the higher an idea is in the hierarchy, the longer the reading time and the better the memory for that idea (see Narrative Comprehension, Psychology of). Thus, the body of research that cognitive psychologists generated in the 1970s and early 1980s supported the notion that data structures like frames and scripts held promise for capturing interesting aspects of human information processing. Nevertheless, it soon became clear that modifications of the whole notion of a schema were necessary in order to account for people's flexibility in comprehension and memory processing.
3. Schema Theory Revised: The New Synthesis

Certainly, information in a text or the context of an actual event could serve as a cue to trigger a relevant script. But what about situations that are more novel, for which no existing script is triggered? At best, human information processing can only sometimes be guided by large-scale schematic structures. Traditionally, schema theorists have tended to view schema application as distinct from the processes involved in building up schemata in the first place. So, people might use a relevant schema whenever possible, but when no relevant schema is triggered, a different set of processes is initiated to construct new schemata. More recently, however, an alternative view of schemata has been developed, again with contributions from both AI and cognitive psychology. This newer approach places schematic processing at one end of a continuum in which processing varies from data-driven to conceptually driven. Data-driven (also known as bottom-up) processing means that the analysis of a stimulus is not appreciably influenced by prior knowledge. In the case of text comprehension, this would mean that the mental representation of a text would stick closely to information that was directly stated. Conceptually driven (also known as top-down) processing means that the analysis of a stimulus is guided by pre-existing knowledge, as, for example, described by traditional schema theory. In the case of text comprehension, this would mean that the reader makes many elaborative inferences that go beyond the information explicitly stated in the text. In many cases, processing will fall at an intermediate point
on the continuum, so that there is a modest conceptually driven influence by pre-existing knowledge structures. This more recent view of schemata has been called schema assembly theory. The central idea of this view is that a schema, in the sense of a framework guiding our interpretation of a situation or event, is always constructed on the spot based on the current context. Situations vary in how much background knowledge is readily available: sometimes a great deal of related information is accessed that can guide processing, and at other times very little such background knowledge is available. In the latter cases we process new information in a data-driven fashion. This new view of schemata was synthesized from work in AI and cognitive psychology as researchers realized the need to incorporate a more dynamic view of the memory system into models of knowledge-based processing. For example, AI programs based on scripts performed poorly when confronted with the kinds of script variations and exceptions seen in the 'real world.' Accordingly, Schank (1982) began to develop systems in which appropriate script-like structures were built to fit the specific context rather than simply retrieved as a precompiled structure. Thus, different possible restaurant subscenes could be pulled together in an ad hoc fashion based on both background knowledge and current input. In short, Schank's simulations became less conceptually driven in their operation. Likewise, cognitive psychologists, confronted with empirical data that were problematic for traditional schema theories, have moved to a more dynamic view. There were two main sets of empirical findings that motivated cognitive psychologists to make these changes. First, it became clear that people's processing of discourse was not as conceptually driven as classic schema theories proposed. In particular, making elaborative inferences that go beyond information directly stated in the text is more restricted than originally believed. People may be quite elaborative when reconstructing a memory that is filled with gaps, but when tested for inference making concurrent with sentence processing, people's inference generation is modest. For example, according to the classic view of schemata, a passage about someone digging a hole to bury a box of jewels should elicit an inference about the instrument used in the digging. The default value for such an instrument would probably be a shovel. Yet, when people are tested for whether the concept of shovel is primed in memory by such a passage (for example, by examining whether the passage speeds up naming time for the word 'shovel'), the data suggest that people do not spontaneously infer instruments for actions. Some kinds of inferences are made, such as connections between goals and actions or the filling in of some general terms with more concrete instances, but inference making is clearly more restricted than researchers believed in the early days after the rediscovery of schemata (e.g., Graesser and Bower 1990, Graesser et al. 1997). Second, it is not always the case that what we remember from some episode or from a text is based largely on pre-existing knowledge. As Walter Kintsch has shown, people appear to form not one but three memories as they read a text: a short-lived representation of the exact wording, a more durable representation that captures a paraphrase of the text, and a situation model that represents what was described by the text, rather than the text itself (e.g., Kintsch 1988). It is this third type of representation that is strongly influenced by available schemata. However, whether or not a situation model is formed during comprehension depends not only on the availability of relevant chunks of knowledge but also on the goals of the reader (see Memory for Meaning and Surface Memory; Situation Model: Psychological). Thus, background knowledge can affect processing of new information in the way described by schema theories, but our processing represents a more careful balance of conceptually driven and data-driven processes than claimed by traditional schema theories. Moreover, the relative contribution of background knowledge to understanding varies with the context, especially the type of text and the goal of the reader. To accommodate more flexibility in the use of the knowledge base, one important trend in research on schemata has been to incorporate schemata into connectionist models. In connectionist models the representation of a concept is distributed over a network of units. The core idea is that concepts, which are really repeatable patterns of activity among a set of units, are associated with varying strengths. Some concepts are so often associated with other particular concepts that activating one automatically entails activating the other. However, in most cases the associations are somewhat weaker, so that activating one concept may make another more accessible if needed. Such a system can function in a more top-down manner or a more bottom-up manner depending on the circumstances. To rephrase this idea in terms of traditional schema-theoretic concepts, some slots are available but they need not be filled with particular default values if the context does not require it. Thus, the particular configuration of the schema that is used depends to a large extent on the processing situation (see Cognition, Distributed). In closing, it is important to note that the success that researchers in cognitive psychology and AI have had in delineating the content of schemata and in determining when such background knowledge influences ongoing processing has had an important influence on many other ideas in psychology. The schema concept has become a critical one in clinical psychology, for example in cognitive theories of depression, and in social psychology, for example in understanding the influence of stereotypes on person perception (e.g., Hilton and von Hippel 1996). It
remains to be seen whether the concept of a schema becomes as dynamic in these areas of study as it has in cognitive psychology. See also: Connectionist Models of Concept Learning; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Knowledge (Explicit and Implicit): Philosophical Aspects; Knowledge Representation; Knowledge Spaces; Mental Representations, Psychology of; Metaphor and its Role in Social Thought: History of the Concept; Reference and Representation: Philosophical Aspects; Schemas, Social Psychology of
Bibliography
Bartlett F C 1932 Remembering. Cambridge University Press, Cambridge, UK
Bower G H, Black J B, Turner T J 1979 Scripts in memory for text. Cognitive Psychology 11: 177–220
Bransford J D, Johnson M K 1972 Contextual prerequisites for understanding: A constructive versus interpretive approach. Journal of Verbal Learning and Verbal Behavior 11: 717–26
Brewer W F, Treyens J C 1981 Role of schemata in memory for places. Cognitive Psychology 13: 207–30
Graesser A C, Bower G H (eds.) 1990 Inferences and Text Comprehension: The Psychology of Learning and Motivation. Academic Press, San Diego, CA, Vol. 25
Graesser A C, Millis K K, Zwaan R A 1997 Discourse comprehension. Annual Review of Psychology 48: 163–89
Head H 1920 Studies in Neurology. Oxford University Press, London
Hilton J L, von Hippel W 1996 Stereotypes. Annual Review of Psychology 47: 237–71
Kant I 1958 Critique of Pure Reason, trans. by N. K. Smith. Modern Library, New York. Originally published in 1781
Kintsch W 1988 The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–82
Mandler J M 1984 Stories, Scripts, and Scenes: Aspects of Schema Theory. Erlbaum Associates, Hillsdale, NJ
Minsky M 1975 A framework for representing knowledge. In: Winston P H (ed.) The Psychology of Computer Vision. McGraw-Hill, New York
Schank R C 1982 Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge University Press, New York
Schank R C, Abelson R P 1977 Scripts, Plans, Goals, and Understanding. Erlbaum Associates, Hillsdale, NJ
Whitney P, Budd D, Bramucci R S, Crane R S 1995 On babies, bath water, and schemata: A reconsideration of top-down processes in comprehension. Discourse Processes 20: 135–66
P. Whitney
Schemas, Social Psychology of

Schemas (or schemata) are generic knowledge structures that summarize past experiences and provide a framework for the acquisition, interpretation, and
retrieval of new information. For example, one might have a clown schema based on past encounters with, and previously learned knowledge about, clowns. When one encounters a new clown, one's clown schema may cause one to notice its painted face, expect polka-dotted pantaloons, interpret its actions as goofy, and recall the presence of a unicycle. Usually, such operations are functional, leading to more rapid, accurate, and detailed information processing. However, such operations can lead to biases or errors when they produce inaccurate information (e.g., when one mistakenly recalls a clown as having a bulbous red nose).
1. History

Many of the fundamental precepts underlying the schema concept were anticipated by the Gestalt movement in European psychology. Bartlett (1932) incorporated these ideas into his concept of schema. He posited that past memories are stored as larger, organized structures rather than as individual elements, and that newly perceived information is accommodated into these 'masses' of old information. He suggested that this accommodation process is accomplished primarily through unconscious inference and faulty memory. Consequently, his theory focused more on the dysfunctional, reconstructive, and inaccurate nature of schemas. Because of his emphasis on subjective experience and the unconscious, his views and research methods departed from the behaviorist leanings of American psychology, and were largely ignored until the cognitive revolution in the 1960s and 1970s. In the mid-1970s, schema theory was refined by cognitive psychologists (see Brewer and Nakamura 1984 and Hastie 1981). While incorporating Bartlett's basic premises, these modern approaches also acknowledged the functional benefits of schemas in terms of increased cognitive efficiency and accuracy. Furthermore, these modern approaches extended the concept of schema by incorporating it within contemporary cognitive theory. For example, schemas are now construed as (a) selectively activated, just like any other concept in memory, (b) nested within each other, with each defined partly through reference to related schemas, (c) providing 'default values' for embedded informational features, and (d) having active, testable memory functions. These ideas stimulated considerable work on schematic processing within cognitive psychology.
2. Schematic Processing

For a schema to affect processing it must first be activated. Because of the unitary and organized nature of schemas, this is generally assumed to be an all-or-nothing process, in that the entire schema (as opposed
to bits and pieces) either comes to mind completely or not at all. For example, one could not activate a chair schema without becoming aware of both the typical appearance of chairs and their normal seating function. Research has identified a number of variables that influence when particular schemas are activated by new information, including fit, context, and priming. In terms of fit, for example, a chair with typical features like four legs, armrests, and a back is more likely to activate the chair schema than is a three-legged stool. Context can also influence activation, so that an ice chest is more likely to activate a chair schema at a campsite than on a store shelf. Finally, schemas can be primed in memory through frequent or recent experiences that bring them to mind. For example, reading about chairs in this paragraph could increase the likelihood that you would activate a chair schema in response to a tree stump. Once activated, schemas appear to have complex effects on attention. Overall, research suggests that schemas direct attention toward schema-relevant, rather than schema-irrelevant, information. However, it makes some difference whether the schema-relevant information is schema congruent or schema incongruent. Schema-incongruent information captures the most attention, presumably because it violates expectations. Schema-congruent information is attended to while an appropriate schema is in the process of being activated or instantiated, but ignored once this activation is accomplished. For example, early in the process a seat might be noticed because it helps to identify an object as a chair, but once a chair schema is activated, the seat would be ignored as obvious. Schemas have also been shown to have powerful effects on the interpretation (or encoding) of new information, providing a framework that shapes expectancies and judgments. Research suggests that this influence is especially pronounced when the new information is ambiguous, vague, or incomplete. For example, the ambiguous word 'chair' would be interpreted differently if you had a department head schema activated than if you had a furniture schema activated. Finally, research indicates that schemas affect the retrieval of information from memory. Memory effects are often viewed as reflecting the organizational properties of schemas. Because schemas are large, organized clusters of linked information, they provide multiple routes of access for retrieving individual items of information. Such retrieval routes have been shown to facilitate memory in a number of ways. For example, experts have particularly complex and well-organized schemas and consequently exhibit better recall for domain-relevant information. Thus, as chess expertise increases, so does memory for board arrangements from actual games (though not for random arrangements). Enhanced recall for schema-related information can reflect either retrieval or guessing mechanisms. For
example, if a newspaper article activates a prisoner execution schema, the link between such executions and electric chairs could help one to recall that an electric chair was mentioned. On the other hand, the schema may instead facilitate an educated guess. As noted previously, schemas have been found to provide defaults for filling in missing information. Thus, an electric chair might be 'remembered' even if it were not mentioned in the story. Such guesses produce biases and errors when the inferred information is incorrect.
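A toy rendering of the processing account developed in this section: candidate schemas are scored on feature fit plus context and priming boosts, and the winning schema's defaults then stand ready to supply interpretations or 'remembered' details. Every feature, weight, and default below is invented for illustration.

```python
# Hypothetical schema store; features and defaults are illustrative only.
SCHEMAS = {
    "chair": {"features": {"legs", "seat", "back", "armrests"},
              "defaults": {"function": "sitting"}},
    "stool": {"features": {"legs", "seat"},
              "defaults": {"function": "sitting"}},
}

def activation(schema, observed, context=0.0, priming=0.0):
    # Fit: proportion of the schema's typical features actually observed.
    fit = len(schema["features"] & observed) / len(schema["features"])
    return fit + context + priming

observed = {"legs", "seat"}
scores = {name: activation(s, observed,
                           context=0.2 if name == "chair" else 0.0,  # e.g., campsite
                           priming=0.1 if name == "chair" else 0.0)  # recent mentions
          for name, s in SCHEMAS.items()}

winner = max(scores, key=scores.get)
print(scores)                               # graded activation scores...
print(winner, SCHEMAS[winner]["defaults"])  # ...and default-based guessing
```

The context and priming boosts stand in for the activation influences described above; the winner's defaults implement the schema-based guessing that can produce false 'memories.'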
3. Criticisms of the Schema Concept
A number of criticisms were leveled at the schema concept in the 1980s (e.g., Alba and Hasher 1983, Fiske and Linville 1980). First, schemas were criticized as definitionally vague, with little consensus regarding concept boundaries. In other words, it is unclear what differentiates schemas from other kinds of knowledge, attitudes, or attributions. Second, research on schemas has been attacked as operationally vague, especially because of unvalidated manipulations. For example, researchers sometimes assume that schemas are activated by cues or expectancies in the absence of manipulation checks evidencing such activation. Third, work on schemas has been denounced as nonfalsifiable, in that virtually any result can be interpreted as reflecting schema use. Fourth, it has been argued that schema theory has difficulty accounting for the complexity and accuracy of memory. For instance, there is evidence that people encode and remember more than simply the gist of events, and that schematic distortions are the exception rather than the rule. Finally, some critics have complained that schema theory and research amount to little more than the reframing of previously known phenomena in terms of a new, schematic vocabulary. For example, biases in person perception documented as early as the 1940s have since been reframed as schema effects.

On the other hand, schema theory survived such criticisms because its strengths clearly outweigh its weaknesses. The breadth of the 'vague' schema concept allows it to encompass a wide variety of phenomena, ranging from autobiographical memory to visual perception. In many such contexts, the schema concept has been embedded in elaborate theory, allowing for more precise definition, manipulation, and measurement. With such refinements, schema theory often makes specific, falsifiable predictions that lead to clearer, if not totally unambiguous, interpretations. Additionally, modern schema research has increasingly focused on the ways that schemas facilitate, rather than interfere with, memory, accounting better for the frequent accuracy of memory. Overall, the schema concept has proven to be tremendously heuristic, leading to novel research that has documented important new phenomena.
4. Social Psychological Research

Social psychologists have applied the schema concept to complex interpersonal phenomena, often using cognitive methods and theory. Given this approach, research on schemas has become an integral part of the subfield known as social cognition. Such research has not only elaborated the kinds of information processing effects described in the cognitive literature, but has also led to the identification of a wide variety of new phenomena.

4.1 Schematic Processing of Social Information

Social psychologists have examined the effects of trait schemas on the processing of schema-related information (see Taylor and Crocker 1981). This research found patterns of recall for congruent, incongruent, and irrelevant information paralleling the patterns described earlier for cognitive stimuli. Specifically, people best recall trait-incongruent information, followed by trait-congruent information, with trait-irrelevant information last. For example, if people have an impression (schema) of Jamie as honest, they would best recall his dishonest behaviors (e.g., embezzling money from the honor society), because this information is surprising and needs to be reconciled with Jamie's other behaviors. They would next best recall Jamie's honest behaviors (e.g., returning a lost wallet), because these are expected and, thus, readily fit into the existing schema. Jamie's honesty-irrelevant behaviors (e.g., consulting on statistical issues) would be worst recalled because they do not relate in any way to the honesty schema. Perhaps one of the most significant contributions of this social psychological work has been to move beyond the differential attention explanation for these findings. Such effects are now understood to also reflect behavior-to-behavior or behavior-to-schema interconnections that are created as people think about how this information can be reconciled. Incongruent information actually becomes most interconnected as people mull it over while trying to make sense of it, and irrelevant information remains least interconnected, because it warrants little reflection.
4.2 Self-schemas

Social psychologists have also used the concept of schema to understand the nature of the self, which had been debated by psychologists for a century (see Linville and Carlston 1994). Most early empirical work focused on the content of the self-concept, though such work was hindered by vague conceptions of self and by the idiosyncratic nature of self-knowledge. The conception of self as a schema provided a more coherent definition, and shifted the empirical emphasis from the content of the self to the functions of self-schemas. The consequence was a proliferation
of social psychological research in this area. Considerable empirical effort went into assessing whether the self is a unique kind of schema, distinct from other forms of social knowledge. Although there are exceptions, most studies have failed to find convincing evidence that self-schemas are unique or distinct from other forms of knowledge. Consequently, the seemingly 'unique' effects of self-schemas may simply reflect the greater complexity and personal relevance of self-knowledge. Researchers have also explored whether people have just one, unified self-schema, or whether people have multiple self-schemas. Most contemporary theories depict people as having multiple self-schemas, with the one that is currently activated (in use) being referred to as the working or phenomenal self. Several approaches to the idea of multiple selves have been highly influential. One approach (self-complexity theory) suggests that people have different self-schemas representing their varied social roles and relationships, such as student, parent, athlete, lover, and so on. When many such self-schemas implicate different attributes, then failures in one realm may be offset by successes in another, leading to emotional buffering and enhanced mental health. On the other hand, when one has few self-schemas, and these overlap considerably, then failure in any realm can have a devastating effect on one's self-esteem. Another approach (self-discrepancy theory) identifies three kinds of self-schema: the actual self that we each believe describes us as we are, the ideal self that we aspire to become, and the ought self that others expect us to live up to. According to this theory, increased (rather than decreased) overlap among self-schemas leads to enhanced mental health. For example, discrepancies between the actual and ideal selves lead to dysphoria, and discrepancies between the actual and ought selves lead to anxiety. Theorists now construe multiple self-schemas in terms of the different kinds of information that are activated or accessible at any given time. A number of factors affect which self-schema is currently activated. The contexts or situations we find ourselves in can cue particular self-schemas. For instance, a work setting will generally cue our professional selves, whereas a home setting will generally cue our familial selves. The self-aspects that are most salient to an individual are also influenced by individual differences. For example, members of Mensa may chronically categorize themselves and others in terms of intelligence. Such 'schematics' tend to view the chronic trait as important and to rate themselves highly on that dimension compared with 'aschematics.' A related issue involves whether self-schemas are stable or change over time. In general, schemas are viewed as fairly stable representations, and this is even more pronounced for self-schemas. However, in recent years research has focused on factors underlying the occasional malleability of the self-concept.
Researchers have shown that major life events, such as losing a job or experiencing the death of a loved one, can affect the self-schema substantially. Additionally, our self-schemas can change when we obtain new self-knowledge (e.g., learning our IQ). Relatedly, our self-schemas can be influenced by how others think about and treat us. For example, research suggests that if teachers treat students as incompetent, the students tend to incorporate this view into their self-schemas and behave accordingly. Finally, our self-schemas change over time, reflecting normal age-related changes in maturity, social and cognitive development, and complexity.
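Self-discrepancy theory invites a simple operationalization: represent each self-schema as an attribute list and score the mismatch between the actual self and each standard. The attributes and the scoring rule below are invented for illustration, not taken from Higgins's idiographic measures.

```python
# Hypothetical attribute sets for the three selves of self-discrepancy theory.
actual = {"kind", "curious", "disorganized", "anxious"}
ideal  = {"kind", "curious", "organized", "calm"}      # who I aspire to be
ought  = {"kind", "dutiful", "organized"}              # who others expect

def discrepancy(actual_self, standard):
    """Fraction of the standard's attributes the actual self fails to match."""
    return len(standard - actual_self) / len(standard)

# Per the theory: actual-ideal discrepancy tracks dysphoria,
# actual-ought discrepancy tracks anxiety.
print("actual-ideal:", discrepancy(actual, ideal))
print("actual-ought:", discrepancy(actual, ought))
```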
4.3 Biases in Person Perception

As noted earlier, schemas serve a number of cognitive functions, such as providing a framework for interpreting information and furnishing default values for unobserved or unrecalled details. When these inferences are incorrect, they produce biases and errors. For example, impressions of others can be influenced by context schemas, as when a person is viewed as more athletic when encountered at a gym than when encountered at a funeral. Similarly, impressions of a new acquaintance can be influenced by others present in the environment, even when there is no logical reason for this. For instance, relatives of physically disabled people may be viewed as having some of the same limitations simply because the disability schema has been activated. Impressions can also be influenced by déjà vu effects. If you meet someone new who resembles someone you already know, activation of the resembled person's schema may cause you to assume that the new person has similar traits and characteristics. Role schemas can also bias impressions. If you know two nuns, you may confuse them because they activate similar role schemas. These are just a couple of examples of the many ways in which schemas can alter impressions.
4.4 Types of Social Schema

As the preceding review suggests, social psychologists have identified many different types of social schema. These include the previously mentioned role schemas (e.g., occupations), relationship schemas (e.g., parent), and trait schemas (e.g., honesty). Additionally, impressions of individuals have been construed as person schemas. Event schemas represent common 'scripts' as ordered sequences of actions that comprise a social event, such as attending a wedding or dining at a restaurant. Such event schemas facilitate memory for social events, as evidenced by research showing that people have difficulty remembering events that are presented to them in an illogical order. There are also nonverbal schemas composed of a series of physical acts
(sometimes called procedural schemas), such as a schema for riding a bike. Similarly, simple judgment rules or heuristics (e.g., smooth talkers are more believable) are sometimes viewed as nonverbal, procedural schemas. Finally, a wide variety of stereotype schemas have been identified in the social literature. Research suggests that race, age, and gender schemas, in particular, are used automatically to categorize others, presumably because these cues are visually salient and our culture defines them as important. Some research has suggested that these automatic schema effects can be overridden when people have sufficient awareness, motivation, and cognitive resources to do so (e.g., Devine 1989). However, this issue remains controversial.
5. Future Directions

The term schema is no longer in vogue, although the essential features of the concept have been incorporated within broader theories of knowledge structure and mental representation. Work in this area promises to become increasingly sophisticated in several respects. New cognitive theories (e.g., connectionism) provide a better understanding of how schemas form and evolve as new information is acquired. Theorists are better defining and differentiating different types of cognitive structures and mechanisms. Research is providing better evidence for the kinds of representations assumed to underlie different phenomena. Such developments are refining our understanding of the schema concept and of how the mind works in general. Ultimately, this work will prove most interesting and influential as it demonstrates that schematic representations have behavioral as well as cognitive consequences.

See also: Piaget's Theory of Child Development; Schemas, Frames, and Scripts in Cognitive Psychology; Social Psychology, Theories of

Bibliography
Alba J W, Hasher L 1983 Is memory schematic? Psychological Bulletin 93: 203–31
Bartlett F C 1932 Remembering. Cambridge University Press, Cambridge, UK
Bartlett F C 1967 Remembering: A Study in Experimental and Social Psychology. Cambridge University Press, London
Brewer W F, Nakamura G V 1984 The nature and functions of schemas. In: Wyer Jr. R S, Srull T K (eds.) Handbook of Social Cognition, Vol. 1. Erlbaum, Hillsdale, NJ
Devine P G 1989 Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology 56: 5–18
Fiske S T, Linville P W 1980 What does the schema concept buy us? Personality and Social Psychology Bulletin 6: 543–57
Hastie R 1981 Schematic principles in human memory. In: Higgins E T, Herman C P, Zanna M P (eds.) Social Cognition: The Ontario Symposium, Vol. 1. Erlbaum, Hillsdale, NJ
Higgins E T 1987 Self-discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
Linville P W 1985 Self-complexity and affective extremity: Don't put all of your eggs in one cognitive basket. Social Cognition 3: 94–120
Linville P W, Carlston D E 1994 Social cognition of the self. In: Devine P G, Hamilton D L, Ostrom T M (eds.) Social Cognition: Impact on Social Psychology. Academic Press, San Diego, CA
Smith E R 1998 Mental representation and memory. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. Oxford University Press, New York
Taylor S E, Crocker J 1981 Schematic bases of social information processing. In: Higgins E T, Herman C P, Zanna M P (eds.) Social Cognition: The Ontario Symposium, Vol. 1. Erlbaum, Hillsdale, NJ

D. E. Carlston and L. Mae
Schizophrenia

Schizophrenia is a psychosis—a severe mental disorder in which the person's emotions, thinking, judgment, and grasp of reality are so disturbed that his or her functioning is seriously impaired. The symptoms of schizophrenia are often divided into 'positive' and 'negative.' Positive symptoms are abnormal experiences and perceptions like delusions, hallucinations, illogical and disorganized thinking, and inappropriate behavior. Negative symptoms are the absence of normal thoughts, emotions, and behavior, such as blunted emotions, loss of drive, poverty of thought, and social withdrawal. Despite common features, different forms of schizophrenia are quite dissimilar. One person, for example, may be paranoid, constantly bothered by voices warning him or her about plots or threats, but able to show good judgment and high functioning in many areas of life. Another may be bizarre in manner and appearance, preoccupied with delusions of bodily disorder, passive and withdrawn. So marked are the differences that many researchers believe that the illness will eventually prove to be a set of different conditions which lead to somewhat similar consequences.
1. Diagnostic Difficulties

It is difficult to define schizophrenia precisely. The two most common functional psychoses are schizophrenia and bipolar disorder (also known as manic-depressive illness). The distinction between the two is not easy to make, and psychiatrists in different parts of the world at different times have drawn the boundaries in different ways. Bipolar disorder is an episodic disorder in which psychotic symptoms are associated with severe alterations in mood—at times elated, agitated episodes of mania, at other times depression, with
physical and mental slowing, despair, guilt, and low self-esteem. The course of schizophrenia, by way of contrast, though fluctuating, tends to be more continuous, and the person's display of emotion is likely to be incongruous or lacking in spontaneity. Markedly illogical thinking is common in schizophrenia. Auditory hallucinations may occur in either bipolar disorder or schizophrenia, but in schizophrenia they are more likely to be commenting on the person's actions or to be conversing one with another. Delusions, also, can occur in both conditions; in schizophrenia they may give the individual the sense that he or she is being controlled by outside forces or that his or her thoughts are being broadcast or interfered with.
2. Schizophrenia is Universal

Schizophrenia is a universal condition and an ancient one. Typical cases are evident in the medical writings of ancient Greece and Rome, and the condition occurs today in every human society. While the content of delusions and hallucinations varies from culture to culture, the form of the illness is similar everywhere. Two World Health Organization studies, applying a standardized diagnostic approach, have identified characteristic cases of schizophrenia in developed and developing countries from around the globe (World Health Organization 1979, Jablensky et al. 1992). One of these studies (Jablensky et al. 1992) demonstrated that the rate of occurrence of new cases (the incidence) is similar in every country studied, from India to Ireland—around one per 10,000 adults each year. However, since both death and recovery rates for people with psychosis are higher in the Third World, the point prevalence of schizophrenia (the number of cases existing at any point in time) is lower there—around 3 per 1,000 of the adult population, compared to 6 per 1,000 in the developed world (Warner and de Girolamo 1995). The risk of developing the illness at some time in one's life (the lifetime prevalence) is a little higher—around one percent of the developed world population. See Mental Illness, Epidemiology of.
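The three measures are connected by a standard steady-state approximation from descriptive epidemiology: point prevalence is roughly the incidence rate multiplied by the mean duration of active illness. The back-of-envelope sketch below uses the round figures just cited; the steady-state assumption and the back-solved durations are illustrative, not taken from the WHO studies.

```python
# Steady-state approximation: point_prevalence ~= incidence * mean_duration.
incidence = 1 / 10_000              # new cases per adult per year (similar everywhere)

prevalence_developed  = 6 / 1_000   # point prevalence, developed world
prevalence_developing = 3 / 1_000   # point prevalence, developing world

# Implied mean duration of active illness, in years:
print(prevalence_developed / incidence)    # 60.0
print(prevalence_developing / incidence)   # 30.0
```

On this rough reading, equal incidence with half the prevalence implies that episodes of illness end (through recovery or death) about twice as quickly in the developing world, which is just what the text above describes.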
3. Conceptual Framework

The bio-psychosocial model of illness clarifies how different factors shape schizophrenia. The model posits that the predisposition to developing an illness, its onset, and its course are each influenced by biological, psychological, and sociocultural factors. A variety of factors can affect the different phases of schizophrenia, many being environmental. Some, such as genetics, gender, and synaptic pruning, are innate. Biological, psychological, and social factors are involved to some extent in most phases of schizophrenia. In general, however, the research suggests that the factors responsible for the predisposition to developing an illness are more likely to be biological, that
psychological factors are often important in triggering the onset, and that the course and outcome are particularly likely to be influenced by sociocultural factors.
4. The Course and Outcome of Schizophrenia

Wide variation occurs in the course of schizophrenia. In some cases the onset of illness is gradual, extending over months or years; in others it begins suddenly. Some have episodes of illness lasting weeks or months with full remission of symptoms between each episode; others have a fluctuating course in which symptoms are continuous; others again have little variation over the course of years. The Swiss psychiatrist Luc Ciompi studied the onset, course, and outcome of illness in people with schizophrenia, following them into old age (Ciompi 1980). He found that the onset of the illness was either acute (less than six months from first symptoms to full-blown psychosis) or insidious in roughly equal numbers of cases; the course was episodic or continuous in approximately equal numbers of patients; and the outcome was moderate to severe disability in half the cases and mild disability or full recovery in the other half. Despite popular and professional belief, schizophrenia does not have a progressive, downhill course with universally poor outcome. In fact, schizophrenia usually becomes less severe as the sufferer grows older. A review of outcome studies conducted in Europe and North America throughout the twentieth century reveals that, over the course of months or years, 20 to 25 percent of people with schizophrenia recover completely from the illness—all their psychotic symptoms disappear and they return to their previous level of functioning. Another 20 percent continue to have some symptoms, but are able to lead satisfying and productive lives (Warner 1994). In the developing countries, recovery rates are even better. The two World Health Organization studies mentioned above (World Health Organization 1979, Jablensky et al. 1992) have shown that good outcome occurs in about twice as many patients diagnosed with schizophrenia in the developing world as in the developed world. The reason for the better outcome in the Third World is not completely understood, but it may be that many people with mental illness in developing world villages are better accepted, less stigmatized, and more likely to find work in a subsistence agricultural economy (Warner 1994).
5. Factors Affecting the Course of Schizophrenia

Age of onset: The later in life the illness begins, the milder it proves to be. Onset of schizophrenia before the age of 14 is rare, but when it does begin this early it is associated with a severe course of illness. Onset after the age of 40 is also rare, and is associated with a milder course.
Gender: Women usually develop their first symptoms of schizophrenia later than men, and the course of their illness tends to be less severe. The reason for these differences is not clear, but may be related to the protective effect of female hormones on brain development and function.

Stressful life events: Stress can trigger episodes of schizophrenia. People with schizophrenia are more likely to report a stressful life event preceding an episode of illness than during a period of remission. Similarly, stressful events are more likely to occur prior to an episode of schizophrenia than in the same time period for people drawn from the general population (Rabkin 1982). Life stress is also more common before the first episode of schizophrenia and so, although stress does not cause the illness, it may well influence the timing of onset. The research indicates that the life events occurring before episodes of schizophrenia are milder and less objectively troublesome than those before episodes of other disorders such as depression (Beck and Worthen 1972), suggesting that people with schizophrenia are exquisitely sensitive to stress.

Domestic stress: The robust results of the 'expressed emotion' (EE) research, conducted in several countries in the developed and developing worlds, reveal that people with schizophrenia living with relatives who are critical or over-involved (referred to in the research as high EE) have a higher relapse rate than those living with relatives who are less critical or intrusive (low EE) (Leff and Vaughn 1985, Parker and Hadzi-Pavlovic 1990). A meta-analysis of 26 EE studies of schizophrenia conducted in 11 countries indicates that the relapse rate over a two-year follow-up period was more than twice as high, at 66 percent, for patients in families which included a high EE relative than in low EE households (29 percent) (Kavanagh 1992). Other studies have shown that relatives who are less critical and over-involved exert a positive therapeutic effect on the person with schizophrenia, their presence leading to a reduction in the patient's level of arousal (Sturgeon et al. 1981). There is no indication that the more critical and over-involved relatives are abnormal by everyday Western standards; low EE family members may be unusually low-key and permissive (Angermeyer 1983). Several studies have shown that family psychoeducational interventions can lead to a change in the level of criticism and over-involvement among relatives of people with schizophrenia and so reduce the relapse rate (Falloon et al. 1982, Berkowitz et al. 1981). Effective family interventions provide three basic ingredients: (a) education about the illness; (b) help in developing problem-solving mechanisms; and (c) practical and emotional support (McFarlane 1983, Falloon et al. 1982, Leff and Vaughn 1985).

Substance use: Drug and alcohol abuse is more common among people with serious mental illness. Unemployment, social isolation, and alienation may
contribute to these high rates. Several studies have shown that people with serious mental illness who abuse substances have a worse course of illness (Drake and Wallach 1989), but other researchers have found psychopathology to be no worse or, sometimes, lower among people with mental illness who use substances (Zisook et al. 1992, Buckley et al. 1994). One reason for this discrepancy may lie in the common finding that substance users are also more likely to be noncompliant with treatment (Drake and Wallach 1989); the poor course of illness, when it is observed, may be a result of this noncompliance rather than a direct consequence of substance use.
6. Causes of Schizophrenia

There is no single organic defect or infectious agent which causes schizophrenia, but a variety of factors increase the risk of developing the illness, including genetic factors and obstetric complications.

Genetic factors. Relatives of people with schizophrenia have a greater risk of developing the illness, the risk being progressively higher among those who are more genetically similar to the person with schizophrenia (see Fig. 1). For a second-degree relative the lifetime risk is about two percent (twice the risk for someone in the general population); for a first-degree relative it is about ten percent; and for an identical twin (genetically identical to the person with schizophrenia) the risk is close to 50 percent (Gottesman 1991). Studies of people adopted in infancy reveal that the increased risk of schizophrenia among the relatives of people with the illness is due to inheritance rather than environment. The children of people with schizophrenia have the same increased prevalence of the illness whether they are raised by their biological parent with schizophrenia or by adoptive parents (Gottesman 1991, Warner and de Girolamo 1995). There is evidence implicating several genes in causing schizophrenia (Wang et al. 1995, Freedman et al. 1997), and it is likely that more than one is responsible, either through an interactive effect or by producing different variants of the disorder. See Mental Illness, Genetics of.

Obstetric complications. A review and meta-analysis of studies conducted prior to mid-1994 on the effect of obstetric complications in schizophrenia reveals that complications before and around the time of birth appear to double the risk of developing the illness (Geddes and Lawrie 1995). In a more recent meta-analysis of a different sample of studies, the same researchers found the risk to be increased by a factor of 1.4. Since these analyses were published, more recent studies have shown variable results. Data gathered at the time of birth from very large cohorts of children born in Finland and Sweden in the 1960s and 1970s indicate that various obstetric complications double
or triple the risk of developing schizophrenia (Hultman et al. 1999, Dalman et al. 1999, Jones et al. 1998). An American study shows that the risk of schizophrenia is more than four times greater in those who experience oxygen deprivation before or at the time of birth, and that such complications increase the risk of schizophrenia much more than the risk of other psychoses like bipolar disorder (Zornberg et al. 2000). Recent Scottish research, on the other hand, found the effect of obstetric complications to be smaller than in prior studies (Kendell et al. 2000). Other recent research suggests that only early-onset (before age 30) cases of schizophrenia are associated with obstetric complications (Byrne et al. 2000, Cannon et al. 2000).

Figure 1. Lifetime risk of developing schizophrenia, by relationship to a person with the illness (proportion of genes shared in parentheses):

    General population                       1%
    First cousins (3rd degree, 12.5%)        2%
    Uncles/aunts (2nd degree, 25%)           2%
    Nephews/nieces (2nd degree, 25%)         4%
    Grandchildren (2nd degree, 25%)          5%
    Half siblings (2nd degree, 25%)          6%
    Parents (1st degree, 50%)                6%
    Siblings (1st degree, 50%)               9%
    Children (1st degree, 50%)              13%
    Fraternal twins (1st degree, 50%)       17%
    Identical twins (100%)                  48%

The average risk of developing schizophrenia for relatives of a person with the illness, compiled from family and twin studies conducted in Europe between 1970 and 1987. Reprinted by permission of the author from Gottesman (1991, p. 96). © 1991 Irving I. Gottesman.

Obstetric complications are a statistically important risk because they are so common. In the general population, they occur in up to 40 percent of births (the precise rate of occurrence depending on how they are defined) (McNeil 1988, Geddes and Lawrie 1995). The authors of the meta-analyses cited above estimate that complications of pregnancy and delivery may increase the prevalence of schizophrenia by 20 percent (Geddes and Lawrie 1995). The obstetric complications most closely associated with the increased risk of developing schizophrenia are those that induce fetal oxygen deprivation, particularly prolonged labor (McNeil 1988), and placental complications (Jones et al. 1998, Hultman et al. 1999, Dalman et al. 1999). Early delivery is also more
common for those who go on to develop schizophrenia, and infants who suffer perinatal brain damage are at a much-increased risk of subsequent schizophrenia (Jones et al. 1998). Trauma at the time of labor and delivery, and especially prolonged labor, is associated with an increase in structural brain abnormalities—cerebral atrophy and small hippocampi—which occur frequently in schizophrenia (McNeil et al. 2000).

Viruses. The risk of intrauterine brain damage is increased if a pregnant woman contracts a viral illness. We know that more people with schizophrenia are born in the late winter or spring than at other times of year, and that this birth bulge sometimes increases after epidemics of viral illnesses like influenza, measles, and chickenpox. Maternal viral infections, however, probably account for only a small part of the increased risk for schizophrenia (Warner and de Girolamo 1995).
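The 20 percent estimate attributed to Geddes and Lawrie (1995) above can be sanity-checked with Levin's population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of births with complications and RR the relative risk. Whether the authors used exactly this formula is not stated here, so treat the sketch as illustrative arithmetic only.

```python
def attributable_fraction(p_exposed, relative_risk):
    """Levin's population attributable fraction."""
    excess = p_exposed * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Up to 40 percent of births involve some obstetric complication (McNeil 1988);
# the two meta-analytic risk estimates cited above are RR = 2.0 and RR = 1.4.
for rr in (1.4, 2.0):
    print(rr, round(attributable_fraction(0.4, rr), 2))
# -> 1.4: ~0.14, 2.0: ~0.29; the 20 percent estimate falls between the two.
```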
7. Myths about the Causes of Schizophrenia

Parenting: Contrary to the beliefs of professionals prior to the 1970s, and to the impression still promoted by the popular media, there is no evidence, even after decades of research, that family or parenting problems cause schizophrenia. As early as 1948, psychoanalysts proposed that mothers fostered schizophrenia in their
offspring through cold and distant parenting. Other theorists blamed parental schisms and confusing patterns of communication within the family (Lidz et al. 1965, Laing and Esterson 1970). The double-bind theory, put forward by anthropologist Gregory Bateson, argued that schizophrenia is promoted by contradictory parental messages from which the child is unable to escape (Bateson et al. 1956). While enjoying broad public recognition, such theories have seldom been adequately tested, and none of the research satisfactorily resolves the question of whether differences found in the families of people with schizophrenia are the cause or the effect of psychological abnormalities in the disturbed family member (Hirsch and Leff 1975). Millions of family members of people with schizophrenia have suffered needless shame, guilt, and stigma because of this widespread misconception.

Drug abuse: Drug abuse does not cause schizophrenia, though it is possible, but by no means certain, that it can trigger the onset of the illness. Hallucinogenic drugs like LSD can induce short episodes of psychosis, and heavy use of marijuana and stimulant drugs like cocaine and amphetamines may precipitate brief, toxic psychoses with features similar to schizophrenia (Bowers 1987), but there is no evidence that these drugs cause a long-lasting illness like schizophrenia. In the 1950s and 1960s, LSD was used as an experimental drug in psychiatry in Britain and America, but the proportion of these volunteers and patients who developed a long-lasting psychosis was scarcely greater than in the general population (Cohen 1960, Malleson 1971). A Swedish study found that army conscripts who used marijuana heavily were six times more likely to develop schizophrenia later in life (Andreasson et al. 1987), but this may well have been because those people who were destined to develop schizophrenia were more likely to use marijuana as a way to cope with the premorbid symptoms of the illness. Schizophrenia is preceded by a long period of prodromal symptoms, and a German study has demonstrated that the onset of drug and alcohol abuse in people with schizophrenia usually follows the very first negative symptom of schizophrenia (such as social withdrawal) but precedes the first positive symptom (such as hallucinations). The authors conclude that substance use is an avenue to the relief of the earliest symptoms of the illness, but is not a cause (Hambrecht and Hafner 1995).
8. The Brain in Schizophrenia

Physical changes in the brain have been identified in some people with schizophrenia. The analysis of brain tissue after death has revealed a number of structural abnormalities, and new brain-imaging techniques have revealed changes in both the structure and function of the brain during life. Techniques such as magnetic resonance imaging (MRI) reveal changes in the size of
different parts of the brain, especially the temporal lobes. The fluid-filled spaces (the ventricles) in the interior of the temporal lobes are often enlarged and the temporal lobe tissue diminished. The greater the observed changes, the greater the severity of the person's thought disorder and auditory hallucinations (Suddath et al. 1990). Some imaging techniques, such as positron emission tomography (PET), measure the actual functioning of the brain and provide a similar picture of abnormality. PET scanning reveals hyperactivity in the temporal lobes, particularly in the hippocampus, a part of the temporal lobe concerned with episodic memory (Tamminga et al. 1992). Another type of functional imaging, electrophysiological brain recording using EEG tracings, shows that most people with schizophrenia seem to be excessively responsive to repeated environmental stimuli and more limited in their ability to blot out irrelevant information (Freedman et al. 1997). In line with this finding, those parts of the brain that are supposed to screen out irrelevant stimuli, such as the frontal lobe, show decreased activity on PET scan (Tamminga et al. 1992). Tying in with this sensory screening difficulty, postmortem brain tissue examination has revealed problems in a certain type of brain cell—the inhibitory interneuron. These neurons damp down the action of the principal nerve cells, preventing them from responding to too many inputs. Thus, they prevent the brain from being overwhelmed by too much sensory information from the environment. The chemical messengers or neurotransmitters (primarily gamma-aminobutyric acid, or GABA) released by these interneurons are diminished in the brains of people with schizophrenia (Benes et al. 1991, Akbarian et al. 1993), suggesting that there is less inhibition of brain overload. Abnormality in the functioning of these interneurons appears to produce changes in the brain cells that release the neurotransmitter dopamine. The role of dopamine has long been of interest to schizophrenia researchers, because drugs like amphetamines that increase dopamine's effects can cause psychoses that resemble schizophrenia, and drugs that block or decrease dopamine's effect are useful for the treatment of psychoses (Meltzer and Stahl 1976). Dopamine increases the sensitivity of brain cells to stimuli. Ordinarily, this heightened awareness is useful in increasing a person's awareness during times of stress or danger, but, for a person with schizophrenia, the addition of the effect of dopamine to an already hyperactive brain state may tip the person into psychosis. These findings suggest that in schizophrenia there is a deficit in the regulation of brain activity by interneurons, so that the brain over-responds to environmental signals and lacks the ability to screen out unwanted stimuli. This problem is made worse by a decrease in the size of the temporal lobes, which ordinarily process
sensory inputs, making it more difficult for the person to respond appropriately to new stimuli. (See MRI (Magnetic Resonance Imaging) in Psychiatry.)
9. Why Does Schizophrenia Begin after Puberty?

Schizophrenia researchers have long been puzzled about why the illness normally begins in adolescence when important risk factors, such as genetic loading and neonatal brain damage, are present from birth or sooner. Recent research attempts to address the question. Normal brain development leads to the loss of 30–40 percent of the connections (synapses) between brain cells during the developmental period from early life to adolescence (Huttenlocher 1979). Brain cells themselves do not diminish in number during this period, only their connectivity. It appears that we may need a high degree of connectivity between brain cells in infancy to enhance our ability to learn language rapidly (toddlers learn as many as twelve new words a day). The loss of synaptic connections during later childhood and adolescence, however, improves our 'working memory' and our efficiency in processing complex linguistic information (McGlashan and Hoffman 2000). For people with schizophrenia, this normally useful process of synaptic pruning is excessive, leaving fewer synapses in the frontal lobes and medial temporal cortex (Feinberg 1983). In consequence, there are deficits in the interaction between these two areas of the brain in schizophrenia, which reduce the adequacy of working memory (Weinberger et al. 1992). One intriguing computer modeling exercise suggests that decreasing synaptic connections and eroding working memory in this way not only leads to abnormalities in the ability to recognize meaning when stimuli are ambiguous but also to the development of auditory hallucinations (Hoffman and McGlashan 1997). It is possible, therefore, that this natural and adaptive process of synaptic elimination in childhood, if carried too far, could lead to the development of schizophrenia (Feinberg 1983). If true, this would help explain why schizophrenia persists among humans despite its obvious functional disadvantages and its association with reduced fertility. The genes for synaptic pruning may help us refine our capacity to comprehend speech and other complex stimuli, but, when complicated by environmental assaults resulting in brain injury, the result could be symptoms of psychosis. As yet, this formulation is speculative.
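Hoffman and McGlashan's simulations used neural networks of speech perception, which are not reproduced here. The sketch below is a much cruder stand-in, a Hopfield-style associative memory, intended only to show how magnitude-based synaptic pruning can be expressed and how memory retrieval can then be probed at roughly 'normal' versus excessive pruning levels; the network size, pruning fractions, and the memory model itself are arbitrary choices, not the published model.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_hopfield(patterns):
    """Hebbian outer-product learning for an associative memory."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)
    return W / n

def prune(W, fraction):
    """Zero out the weakest connections, mimicking synaptic pruning."""
    W = W.copy()
    magnitudes = np.abs(W[W != 0])
    cutoff = np.quantile(magnitudes, fraction)
    W[np.abs(W) < cutoff] = 0.0
    return W

def recall(W, cue, steps=10):
    s = cue.astype(float)
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

n_units, n_patterns = 200, 5
patterns = rng.choice([-1, 1], size=(n_patterns, n_units))
W = train_hopfield(patterns)

cue = patterns[0].copy()
flipped = rng.choice(n_units, size=20, replace=False)
cue[flipped] *= -1                      # a degraded, ambiguous input

for frac in (0.0, 0.35, 0.8):           # none, roughly 'normal', excessive
    out = recall(prune(W, frac), cue)
    overlap = np.mean(out == patterns[0])
    print(f"pruned ~{frac:.0%}: overlap with stored pattern = {overlap:.2f}")
```

Comparing overlaps across pruning levels gives a feel for the claim: moderate pruning leaves pattern completion largely intact, while aggressive pruning tends to degrade it or pull the network toward spurious states.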
10. Effective Interventions in Schizophrenia

There is more agreement now about what is important in the treatment of schizophrenia than ever before. In establishing the World Psychiatric Association global project designed to combat the stigma and
discrimination resulting from schizophrenia (Warner 2000), prominent psychiatrists from around the world recently agreed on the following principles.

People with schizophrenia can be treated effectively in a variety of settings. These days the use of hospitals is mainly reserved for those in an acute relapse. Outside of the hospital, a range of alternative treatment settings have been devised which provide supervision and support and are less alienating and coercive than the hospital (Warner 1995).

Family involvement can improve the effectiveness of treatment. A solid body of research has demonstrated that relapse in schizophrenia is much less frequent when families are provided with support and education about schizophrenia.

Medications are an important part of treatment but they are only part of the answer. They can reduce or eliminate positive symptoms but they have a negligible effect on negative symptoms. Fortunately, modern, novel antipsychotic medications, introduced in the past few years, can provide benefits while causing less severe side effects than the standard antipsychotic drugs which were introduced in the mid-1950s.

Treatment should include social rehabilitation. People with schizophrenia usually need help to improve their functioning in the community. This can include training in basic living skills; assistance with a host of day-to-day tasks; and job training, job placement, and work support. The psychosocial clubhouse is one effective model for providing many of these forms of assistance (Mosher and Burti 1989). The assertive community treatment model has proven effective in preventing relapse and hospital admission (Stein and Test 1980).

Work helps people recover from schizophrenia. Productive activity is basic to a person's sense of identity and worth. The availability of work in a subsistence economy may be one of the main reasons that outcome from schizophrenia is so much better in Third World villages (Warner 1994). Given training and support, most people with schizophrenia can work, as has been demonstrated by several northern Italian employment programs (Warner 1994). However, due to problems such as work disincentives in disability pension schemes, high general unemployment, and inadequate vocational rehabilitation services, the employment of people with schizophrenia in Britain and the United States has routinely been as low as 15 percent in recent years (Warner 2000).

People with schizophrenia can get worse if treated punitively or confined unnecessarily. Extended hospital stays are rarely necessary if good community treatment is available. Jails or prisons are not appropriate places of care. Yet, around the world, large numbers of people with schizophrenia are housed in prison cells, usually charged with minor crimes, largely because of the lack of adequate community treatment.

People with schizophrenia and their family members should help plan and even deliver treatment.
Consumers of mental health services can be successfully employed in treatment programs, and when they help train treatment staff, professional attitudes and patient outcome both improve (Sherman and Porter 1991, Warner 2000).

People's responses towards someone with schizophrenia influence the person's course of illness and quality of life. Negative attitudes can push people with schizophrenia and their families into hiding the illness and drive them away from help. If people with schizophrenia are shunned and feared, they cannot be genuine members of their own community. They become isolated and victims of discrimination in employment, accommodation, and education (Warner 2000). The recent US Surgeon General's report on mental illness cited stigma as one of the most important obstacles to effective treatment (US Department of Health and Human Services 1999).

See also: Depression, Clinical Psychology of; Developmental Psychopathology: Child Psychology Aspects; Differential Diagnosis in Psychiatry; Mental and Behavioral Disorders, Diagnosis and Classification of; Mental Illness, Epidemiology of; Mental Illness, Genetics of; Psychiatric Assessment: Negative Symptoms; Schizophrenia and Bipolar Disorder: Genetic Aspects; Schizophrenia: Neuroscience Perspective; Schizophrenia, Treatment of
Bibliography

Akbarian S, Vinuela A, Kim J J, Potkin S G, Bunney W E J, Jones E G 1993 Distorted distribution of nicotinamide-adenine dinucleotide phosphate-diaphorase neurons in temporal lobe of schizophrenics implies anomalous cortical development. Archives of General Psychiatry 50: 178–87
Andreasson S, Allebeck P, Engstrom A, Rydberg U 1987 Cannabis and schizophrenia: A longitudinal study of Swedish conscripts. Lancet 1987(2): 1483–86
Angermeyer M C 1983 'Normal deviance': Changing norms under abnormal circumstances. Presented at the Seventh World Congress of Psychiatry, Vienna
Bateson G, Jackson D, Haley J 1956 Towards a theory of schizophrenia. Behavioral Science 1: 251–64
Beck J, Worthen K 1972 Precipitating stress, crisis theory, and hospitalization in schizophrenia and depression. Archives of General Psychiatry 26: 123–9
Benes F M, McSparran I, Bird E D, San Giovani J P, Vincent S L 1991 Deficits in small interneurons in prefrontal and cingulate cortices of schizophrenic and schizoaffective patients. Archives of General Psychiatry 48: 996–1001
Berkowitz R, Kuipers L, Eberlein-Fries R, Leff J D 1981 Lowering expressed emotion in relatives of schizophrenics. New Directions in Mental Health Services 12: 27–48
Bowers M B 1987 The role of drugs in the production of schizophreniform psychoses and related disorders. In: Meltzer H Y (ed.) Psychopharmacology: The Third Generation of Progress. Raven Press, New York
Buckley P, Thompson P, Way L, Meltzer H Y 1994 Substance abuse among patients with treatment-resistant schizophrenia: Characteristics and implications for clozapine therapy. American Journal of Psychiatry 151: 385–9
Byrne M, Browne R, Mulryan N, Scully A, Morris M, Kinsella A, McNeil T, Walsh D, O'Callaghan E 2000 Labour and delivery complications and schizophrenia: Case-control study using contemporaneous labour ward records. British Journal of Psychiatry 176: 531–6
Cannon T D, Rosso I M, Hollister J M, Bearden C E, Sanchez L E, Hadley T 2000 A prospective cohort study of genetic and perinatal influences in the etiology of schizophrenia. Schizophrenia Bulletin 26: 351–66
Ciompi L 1980 Catamnestic long-term study on the course of life and aging of schizophrenics. Schizophrenia Bulletin 6: 606–18
Cohen S 1960 Lysergic acid diethylamide: Side effects and complications. Journal of Nervous and Mental Disease 130: 30–40
Dalman C, Allebeck P, Cullberg J, Grunewald C, Koster M 1999 Obstetric complications and the risk of schizophrenia. Archives of General Psychiatry 56: 234–40
Drake R E, Wallach M A 1989 Substance abuse among the chronically mentally ill. Hospital and Community Psychiatry 40: 1041–6
Falloon I R H, Boyd J L, McGill C W, Razani J, Moss H B, Gilderman A M 1982 Family management in the prevention of exacerbations of schizophrenia: A controlled study. New England Journal of Medicine 306: 1437–40
Feinberg I 1983 Schizophrenia: Caused by a fault in programmed synaptic elimination during adolescence? Journal of Psychiatric Research 17: 319–34
Freedman R, Coon H, Myles-Worsley M, Orr-Urtreger A, Olincy A, Davis A, Polymeropoulos M, Holik J, Hopkins J, Hoff M, Rosenthal J, Waldo M C, Reimherr F, Wender P, Yaw J, Young D A, Breese C R, Adams C, Patterson D, Adler L E, Kruglyak L, Leonard S, Byerley W 1997 Linkage of a neurophysiological deficit in schizophrenia to a chromosome 15 locus. Proceedings of the National Academy of Sciences of the USA 94: 587–92
Geddes J R, Lawrie S M 1995 Obstetric complications and schizophrenia. British Journal of Psychiatry 167: 786–93
Gottesman I 1991 Schizophrenia Genesis: The Origins of Madness. Freeman, New York
Hambrecht M, Hafner H 1995 Substance abuse or schizophrenia: Which comes first? Presented at the World Psychiatric Association Section of Epidemiology and Community Psychiatry Symposium, New York
Hirsch S, Leff J 1975 Abnormality in Parents of Schizophrenics. Oxford University Press, London
Hoffman R E, McGlashan T H 1997 Synaptic elimination, neurodevelopment, and the mechanism of hallucinated 'voices' in schizophrenia. American Journal of Psychiatry 154: 1683–9
Hultman C M, Sparen P, Takei N, Murray R M, Cnattingius S 1999 Prenatal and perinatal risk factors for schizophrenia, affective psychosis, and reactive psychosis of early onset: Case control study. British Medical Journal 318: 421–6
Huttenlocher P R 1979 Synaptic density in the human frontal cortex–developmental changes and effects of aging. Brain Research 163: 195–205
Jablensky A, Sartorius N, Ernberg G, Anker M, Korten A, Cooper J E, Day R, Bertelsen A 1992 Schizophrenia: Manifestations, incidence and course in different cultures: A World Health Organization ten-country study. Psychological Medicine Monograph Supplement 20
Jones P B, Rantakallio P, Hartikainen A-L, Isohanni M, Sipila P 1998 Schizophrenia as a long-term outcome of pregnancy, delivery, and perinatal complications: A 28-year follow-up of the 1966 north Finland general population birth cohort. American Journal of Psychiatry 155: 355–64
Kavanagh D J 1992 Recent developments in expressed emotion and schizophrenia. British Journal of Psychiatry 160: 601–20
Kendell R E, McInneny K, Juszczak E, Bain M 2000 Obstetric complications and schizophrenia: Two case-control studies based on structured obstetric records. British Journal of Psychiatry 176: 516–22
Laing R D, Esterson A 1970 Sanity, Madness and the Family: Families of Schizophrenics. Penguin Books, Baltimore
Leff J, Vaughn C 1985 Expressed Emotion in Families. Guilford Press, New York
Lidz T, Fleck S, Cornelison A 1965 Schizophrenia and the Family. International Universities Press, New York
Malleson N 1971 Acute adverse reactions to LSD in clinical and experimental use in the United Kingdom. British Journal of Psychiatry 118: 229–30
McFarlane W R (ed.) 1983 Family Therapy in Schizophrenia. Guilford Press, New York
McGlashan T H, Hoffman R E 2000 Schizophrenia as a disorder of developmentally reduced synaptic connectivity. Archives of General Psychiatry 57: 637–48
McNeil T F 1988 Obstetric factors and perinatal injuries. In: Tsuang M T, Simpson J C (eds.) Handbook of Schizophrenia: Nosology, Epidemiology and Genetics. Elsevier Science, New York
Meltzer H Y, Stahl S M 1976 The dopamine hypothesis of schizophrenia: A review. Schizophrenia Bulletin 2: 19–76
Mosher L, Burti L 1989 Community Mental Health: Principles and Practice. Norton, New York
Parker G, Hadzi-Pavlovic D 1990 Expressed emotion as a predictor of schizophrenic relapse: An analysis of aggregated data. Psychological Medicine 20: 961–5
Rabkin J G 1982 Stress and psychiatric disorders. In: Goldberger L, Breznitz S (eds.) Handbook of Stress: Theoretical and Clinical Aspects. Free Press, New York, pp. 566–84
Sherman P S, Porter M A 1991 Mental health consumers as case management aides. Hospital and Community Psychiatry 42: 494–8
Stein L I, Test M A 1980 Alternative to mental hospital treatment: I. Conceptual model, treatment program, and clinical evaluation. Archives of General Psychiatry 37: 392–7
Sturgeon D, Kuipers L, Berkowitz R, Turpin G, Leff J 1981 Psychophysiological responses of schizophrenic patients to high and low expressed emotion relatives. British Journal of Psychiatry 138: 40–5
Suddath R L, Christison G W, Torrey E F, Casanova M F, Weinberger D R 1990 Anatomical abnormalities in the brains of monozygotic twins discordant for schizophrenia. New England Journal of Medicine 322: 789–94
Tamminga C A, Thaker G K, Buchanan R, Kirkpatrick B, Alphs L D, Chase T N, Carpenter W T 1992 Limbic system abnormalities identified in schizophrenia using positron emission tomography with fluorodeoxyglucose and neocortical alterations with deficit syndrome. Archives of General Psychiatry 49: 522–30
US Department of Health and Human Services 1999 Mental Health: A Report of the Surgeon General. US Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Mental Health Services, National Institutes of Health, National Institute of Mental Health, Rockville, MD
Wang S, Sun C, Walczak C A, Ziegle J S, Kipps B R, Goldin L R, Diehl S R 1995 Evidence for a susceptibility locus for schizophrenia on chromosome 6pter–p22. Nature Genetics 10: 41–6
Warner R 1994 Recovery from Schizophrenia: Psychiatry and Political Economy. Routledge, New York
Warner R (ed.) 1995 Alternatives to the Hospital for Acute Psychiatric Care. American Psychiatric Press, Washington, DC
Warner R 2000 The Environment of Schizophrenia: Innovations in Practice, Policy and Communications. Routledge, London
Warner R, de Girolamo G 1995 Epidemiology of Mental Problems and Psychosocial Problems: Schizophrenia. World Health Organization, Geneva, Switzerland
Weinberger D R, Berman K F, Suddath R, Torrey E F 1992 Evidence of a dysfunction of a prefrontal-limbic network in schizophrenia: A magnetic resonance imaging and regional cerebral blood flow study of discordant monozygotic twins. American Journal of Psychiatry 149: 890–7
World Health Organization 1979 Schizophrenia: An International Follow-up Study. Wiley, Chichester, UK
Zisook S, Heaton R, Moranville J, Kuck J, Jernigan T, Braff D 1992 Past substance abuse and clinical course of schizophrenia. American Journal of Psychiatry 149: 552–3
Zornberg G L, Buka S L, Tsuang M T 2000 Hypoxia-ischemia-related fetal/neonatal complications and risk of schizophrenia and other nonaffective psychoses: A 19-year longitudinal study. American Journal of Psychiatry 157: 196–202
R. Warner
Schizophrenia and Bipolar Disorder: Genetic Aspects

Schizophrenia and bipolar mood disorder (the latter sometimes called 'manic-depressive illness') are among the most serious of all psychiatric disorders, indeed of all medical disorders. Both of these psychiatric illnesses tend to have a rather early age of onset, with most patients first becoming ill in their teens or twenties, and the symptoms are often chronic, particularly in the case of schizophrenia. Moreover, these illnesses are often severely disabling, and are associated with increased rates of educational problems, unemployment, marital difficulties, alcohol or substance abuse, and suicide. Schizophrenia and bipolar disorder each affect approximately one percent of the population in the USA and Western Europe. If one takes into account not only the numbers of people affected by schizophrenia or bipolar disorder but also the fact that many patients are seriously disabled for much of their adult lives, then the cost of these two disorders, in both economic and human terms, rivals that of diseases such as heart disease and stroke, which affect more people but tend not to strike until much later in life.

This article first considers diagnostic issues, including the most widely used diagnostic criteria for schizophrenia, bipolar disorder, and related disorders. Evidence for genetic factors in schizophrenia and bipolar disorder is reviewed next, including some complicating factors, such as the likely presence of etiologic heterogeneity and the interaction of genes with environmental stressors. Issues of genetic counseling are then considered. The article concludes with a discussion of evidence that genes for schizophrenia and bipolar disorder may have a 'silver lining,' in terms of increased creative potential.
1. Diagnostic Issues

1.1 Schizophrenia and Related Disorders

The criteria currently most widely used in diagnosing schizophrenia, bipolar disorder, and related disorders are those described in the most recent (4th) edition of the Diagnostic and Statistical Manual (DSM-IV) of the American Psychiatric Association (1994). Briefly summarized, the diagnostic criteria for schizophrenia listed in DSM-IV require two or more of the following five characteristic symptoms: hallucinations; delusions; disorganized speech; grossly disorganized behavior; and negative symptoms, such as flattened affect. In addition, the patient must have shown significant deterioration of functioning in areas such as occupational or interpersonal relations, and have had continuous signs of illness for at least six months, including at least one month with the characteristic symptoms.

DSM-IV also recognizes several related disorders that manifest milder forms of certain symptoms that are often seen in schizophrenia. Schizotypal personality disorder, for example, is used for persons who have several characteristic features (e.g., magical thinking, recurrent illusions, and peculiar behavior or speech). If a person does not have such schizotypal eccentricities but has shown several signs of marked disinterest in interpersonal relationships (e.g., extreme aloofness, absence of any close friends), then a diagnosis of schizoid personality disorder is given. In paranoid personality disorder, there is a broad pattern of extreme suspiciousness, as shown by several signs (e.g., irrational fears that others wish to harm one). Family and adoption studies suggest that these personality disorders are part of a 'spectrum' of disorders that are genetically related to schizophrenia proper.

1.2 Bipolar Disorder and Related Disorders

In DSM-IV, the diagnosis of 'bipolar I' disorder requires a history of at least one episode of mania, which is defined as a period in which a person's mood is 'abnormally and persistently' expansive, elevated, or irritable. In addition, the diagnosis requires that the mood change either last at least a week or lead to hospitalization. The disturbance in mood must be severe enough to seriously disrupt social or occupational functioning, and/or to require hospitalization. The manic episode must also involve at least three (four, if only an irritable mood is present) of seven symptoms: (a) greatly inflated self-esteem, (b) increased activity or restlessness, (c) unusual talkativeness, (d) racing thoughts or flight of ideas, (e) decreased need for sleep, (f) distractibility, and (g) extremely risky actions (such as buying sprees or reckless driving) whose potentially dangerous consequences are not appreciated. The criteria for a hypomanic episode are essentially the same as those for a manic one, except that the symptoms are neither psychotic nor so severe that they markedly impair social functioning or require hospitalization. The term bipolar is potentially confusing, because the diagnosis does not require a history of major depression, even though most persons with bipolar disorder will also have experienced episodes of major depression. (By contrast, in major depressive disorder, a person has experienced an episode of major depression, but not an episode of mania.)

As in the case of schizophrenia, there appears to be a 'spectrum' of affective disorders that are genetically related to bipolar disorder but that have symptoms which are milder than those found in frank bipolar disorder. Thus DSM-IV also includes (a) bipolar II disorder (in which a patient has experienced a hypomanic, rather than a manic, episode, as well as an episode of major depression) and (b) cyclothymic disorder, which involves a history of multiple hypomanic episodes as well as multiple periods of depressive symptoms (but not major depression). Bipolar disorder is also often accompanied by other concurrent, or 'co-morbid,' disorders, such as anxiety disorders, or alcohol and substance abuse. Moreover, even when symptoms of mania and depression are in remission, patients with a history of bipolar disorder often still meet criteria for personality disorders, particularly those with narcissistic or histrionic features.

1.3 Differential Diagnosis

A diagnosis of schizophrenia requires that diagnoses of mood disorders and schizoaffective disorders be excluded. Diagnoses of schizophrenia, mood disorders, and schizoaffective disorders all require the exclusion of neurologic syndromes caused by factors such as substance abuse, medications, or a general medical condition. Until rather recently there has been a tendency, particularly in the USA, for mania with psychotic features to be misdiagnosed as schizophrenia. Accurate differential diagnosis is crucial, because misdiagnosis (and resultant inappropriate treatment) may result, on the one hand, in patients being needlessly exposed to harmful side effects of medication or, on the other hand, being deprived of appropriate medications that may alleviate suffering and save patients' jobs, marriages, and even their lives. Mania, for example, often responds well to medications (particularly lithium and certain anticonvulsants such as carbamazepine and valproate) that are less effective in treating schizophrenia. It is important, moreover, to diagnose and treat schizophrenia and bipolar disorder as early as possible in the course of the illness, because there is increasing evidence that these illnesses (and their underlying brain pathologies) tend to worsen if left untreated (e.g., Wyatt 1991).
2. Evidence for Genetic Factors in Schizophrenia

There are several complementary lines of evidence for an important role of genetic factors in the etiology of schizophrenia. First, there are converging lines of evidence from family, twin, and adoption studies that the risk for schizophrenia is greatly increased among schizophrenics' biological relatives. Second, investigators have, in recent years, increasingly marshalled techniques from molecular genetics to look for more direct evidence of genetic factors in schizophrenia.

2.1 Family, Twin, and Adoption Studies

A person's risk of developing schizophrenia increases, on average, with his or her degree of genetic relatedness to a schizophrenic patient (e.g., Matthysse and Kidd 1976, Holzman and Matthysse 1990, Torrey et al. 1994). The risk of developing schizophrenia over a person's lifetime is about 0.8 percent for people in the general population, though there is a several-fold variation in the prevalence of schizophrenia across different populations that have been studied around the world (e.g., see Torrey 1987). By contrast, a person's lifetime risk is 5–10 percent if the person has a first-degree relative with schizophrenia, and is much higher—nearly 50 percent in some studies—if a person is the monozygotic (genetically identical) twin of a schizophrenic patient. While these risk figures are consistent with genetic transmission, they are not conclusive, because the degree of genetic resemblance among relatives tends to parallel their level of exposure to similar environments.
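To make the gradient concrete, the relative risks implied by these figures can be computed directly. The following back-of-envelope sketch simply restates the numbers quoted above (the midpoint is used where the text gives a range):

```python
# Relative risks implied by the lifetime-risk figures quoted above.
# Midpoints are used where the text gives a range; illustrative only.
baseline = 0.008  # general-population lifetime risk (~0.8 percent)
risks = {
    "general population": 0.008,
    "first-degree relative": 0.075,  # midpoint of the 5-10 percent range
    "monozygotic co-twin": 0.48,     # "nearly 50 percent"
}
for group, risk in risks.items():
    print(f"{group:22s} lifetime risk {risk:5.1%}  relative risk {risk / baseline:5.1f}x")
```

The steep gradient (roughly nine-fold for first-degree relatives and sixty-fold for MZ co-twins) motivates the genetic hypothesis, but because shared environment rises along the same gradient, the twin and adoption designs described next are needed to separate the two.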
Twin studies have rather consistently reported concordance rates for schizophrenia in monozygotic (MZ) twins that are several times higher than those for dizygotic (DZ) twins (e.g., Gottesman et al. 1987, Torrey et al. 1994). It is also noteworthy that the schizophrenia concordance rate for MZ twins reared apart is quite close to that for MZ twins reared together. On the other hand, the number of such MZ twins reared apart is rather small. Moreover, it is unclear whether these twins, who were reared in different settings, may still have had significant contact with each other after they were separated, so that the separation of shared genetic and environmental factors may not have been complete.

The most conclusive available evidence for genetic factors in schizophrenia, therefore, comes from adoption studies. For example, Heston (1966) studied 47 adult adoptees who had been born in the USA to a schizophrenic mother, but separated from her shortly after birth. Five of these 'index' adoptees were found to have subsequently developed schizophrenia, vs. none of 50 matched control adoptees who had been born to demographically matched, but psychiatrically healthy, mothers.

More systematic adoption studies of schizophrenia have been carried out in Scandinavia. In Denmark, for example, Kety et al. (1994) were able to identify all individuals in the entire country who had been adopted away from their biological parents at an early age and subsequently were hospitalized with a diagnosis of schizophrenia. For each of these 74 schizophrenic adoptees, a 'control' adoptee was identified who was closely matched for age, gender, and the socioeconomic status of the adoptive home. The control adoptees' biological parents had not been hospitalized for mental illness. Psychiatric diagnoses of over 1100 of the adoptees' respective biological and adoptive relatives were made after careful review of psychiatric interviews and records. Significantly higher rates of schizophrenia were found in the biological (but not in the adoptive) relatives of schizophrenic adoptees than in the biological relatives of control adoptees (5.0 percent vs. 0.4 percent). The prevalence of schizotypal personality disorder was also significantly elevated among the schizophrenic adoptees' biological relatives (Kendler et al. 1994). Moreover, rates of schizophrenia and related 'spectrum' disorders were significantly elevated even in the schizophrenic adoptees' biological paternal half-siblings, who had not even shared the same womb as the schizophrenic adoptees.

2.2 Association and Linkage Studies

In recent decades, advances in molecular genetics have enabled researchers to look more directly for genetic factors in schizophrenia. One strategy is to investigate 'candidate genes' for which there is a theoretical reason to suspect a role in schizophrenia. Thus several groups of investigators have looked for genes that influence susceptibility to certain infectious agents, since exposure to these agents, particularly during pre- or perinatal development, appears to increase risk for schizophrenia. For example, in a number of epidemiological studies, exposure to influenza during the second trimester of gestation has been found to be associated with increased risk of schizophrenic outcome. Individuals' genotypes can powerfully affect their immunologic response to infections (and their mothers' ability to combat infections while they are still in utero). Several studies have found that certain alleles of genes that play an important role
in immune function, such as those in the HLA complex, are more prevalent in schizophrenia. McGuffin (1989) reviewed nine such studies and concluded that there was a highly significant association between the HLA A9 allele and risk for the paranoid subtype of schizophrenia. Murray et al. (1993) suggested that prenatal exposure to influenza may increase risk of schizophrenia in the offspring because, in genetically susceptible mothers, the flu virus stimulates production of maternal antibodies that cross the placenta and disrupt fetal brain development.

Even if one does not have a good 'candidate' gene for a disorder such as schizophrenia, however, it is still possible to apply the more indirect strategy of genetic 'linkage.' This strategy makes use of the fact that genes located very near one another on the same chromosome (and therefore said to be closely 'linked') tend to be inherited together. As a result of recent advances in molecular genetics, there are now thousands of identified genes that can be used to mark various regions of different chromosomes. One can then study large numbers of 'multiplex' families, in which there are at least two family members with schizophrenia, in order to examine whether there is a significant tendency for schizophrenia and alleles for particular marker genes to be transmitted together within the same family.

In principle, genetic linkage studies provide an elegant approach that makes it possible to identify disease genes whose role in a disease such as schizophrenia is a complete surprise to investigators (and thus would never have been chosen as 'candidate' genes for association studies). There are, however, some practical difficulties with linkage studies. One difficulty is that, because hundreds of different marker genes can be examined, there is a rather high probability of obtaining a spurious, or 'false positive,' linkage finding by chance. In order to screen out such false positive findings, it is important to determine whether interesting linkage results can be confirmed in new, independent studies involving large numbers of multiplex families. In fact, linkage findings implicating genes on a particular region of chromosome 6 as risk factors for schizophrenia have recently been reported by several different research groups (e.g., see review by Gershon et al. 1998).
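The scale of this multiple-testing problem is easy to quantify. A minimal sketch, assuming independent tests at a nominal 5 percent significance threshold (the marker counts are illustrative, not taken from any cited study):

```python
# Family-wise false-positive risk in a multi-marker linkage scan,
# assuming independent tests; marker counts are illustrative.
alpha = 0.05  # nominal per-marker significance level

for n_markers in (1, 20, 100, 300):
    p_any_false = 1 - (1 - alpha) ** n_markers
    print(f"{n_markers:3d} markers: P(at least one false positive) = {p_any_false:.3f}")

# A Bonferroni-style correction restores the intended family-wise rate:
print(f"corrected per-marker threshold for 300 markers: {alpha / 300:.2e}")
```

With 300 markers, the chance of at least one spurious hit is essentially 1, which is why replication in independent family samples (or sharply corrected significance thresholds) is demanded before a linkage claim is accepted.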
2.3 Etiologic Heterogeneity and Genotype–Environment Interactions

The search for genetic factors is complicated by several considerations. It is likely, for example, that disorders such as schizophrenia are etiologically heterogeneous. That is, it is probable that the syndrome of schizophrenia can be produced by a number of different combinations of genes and/or environmental factors. Thus, while each of a number of different genes may well increase risk for schizophrenia, it is likely that no single gene is necessary for the production of most cases. One approach to the problem of heterogeneity is to identify characteristics that distinguish specific, genetically more homogeneous, subtypes of schizophrenia or bipolar disorder. Maziade et al. (1994), for example, found evidence for linkage at a locus on chromosome 11 with schizophrenia in one, but not the others, of several large pedigrees that they examined. The schizophrenics in the extended family that did show linkage were distinguished from those in the other families by having a particularly severe and unremitting form of schizophrenia. If this finding can be confirmed in other pedigrees, it would provide a valuable example of the subtyping strategy.

A related problem is that, while a particular gene or genes may significantly increase one's risk for developing schizophrenia or bipolar disorder, they will usually not be sufficient to produce the disorder; that is, most individuals who carry a susceptibility gene will not themselves become ill. (In other words, the gene's effects show 'incomplete penetrance.') One strategy for dealing with this problem is to identify more sensitive, subclinical phenotypes that indicate the presence of the gene even in people who do not develop the illness. For example, studies of schizophrenics' families suggest that most schizophrenics carry a gene which leads to schizophrenia only 5–10 percent of the time, but causes abnormal smooth pursuit eye movements over 70 percent of the time. These eye movement dysfunctions should thus provide a better target for genetic linkage studies than schizophrenia itself (Holzman and Matthysse 1990).

A further complication is the likelihood of genotype–environment interactions. For example, there is evidence from dozens of studies that pre- and perinatal complications are significant risk factors for schizophrenia (e.g., Torrey et al. 1987). Such complications are, of course, hardly unique to schizophrenia; this suggests that pre- or perinatal insults to the developing brain may interact with genetic liability factors to produce schizophrenia. Kinney et al. (1998), for example, found that schizophrenics were much more likely than were either control subjects or the schizophrenics' own non-schizophrenic siblings to have both a major perinatal complication and eye tracking dysfunction. Moreover, these non-schizophrenic siblings tended to have either a history of perinatal complications or eye tracking dysfunction, but not both in the same sibling. This pattern of findings was consistent with a two-factor model in which perinatal brain injury and specific susceptibility genes often interact to produce schizophrenia.
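A toy version of such a two-factor model also shows how two moderately common factors, neither strongly predictive alone, can jointly reproduce risk figures of the size cited earlier. All probabilities below are invented for illustration (and the two factors are assumed independent); they are not estimates from Kinney et al.:

```python
# Toy two-factor model: illness requires both a susceptibility genotype
# and a perinatal insult. All probabilities are invented for
# illustration, and the two factors are assumed independent.
p_gene = 0.10            # carries a susceptibility genotype
p_insult = 0.15          # suffers a major perinatal complication
p_ill_given_both = 0.50  # illness risk when both factors co-occur

risk_in_carriers = p_insult * p_ill_given_both  # penetrance among carriers
risk_in_population = p_gene * risk_in_carriers  # overall lifetime risk

print(f"risk among gene carriers: {risk_in_carriers:.1%}")   # 7.5%
print(f"population lifetime risk: {risk_in_population:.2%}")  # 0.75%
```

Under these made-up numbers the gene is incompletely penetrant (about 7.5 percent of carriers fall ill, within the 5–10 percent range mentioned above), while the population lifetime risk comes out near the 0.8 percent figure quoted earlier.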
3. Evidence for Genetic Factors in Bipolar Disorder

As in the case of schizophrenia, several converging lines of evidence strongly implicate genetic factors in the etiology of bipolar disorder.
3.1 Family, Twin, and Adoption Studies

There is a strong tendency for bipolar disorder to run in families, and the risk of bipolar disorder in a first-degree relative of a manic-depressive patient is about 8 percent (vs. about 1 percent in the general population). Further evidence for the high heritability of bipolar disorder is provided by twin studies, particularly three twin studies conducted in Scandinavia in recent decades. The average concordance rate for bipolar disorder in these studies was 55 percent in MZ vs. only 5 percent in DZ twin pairs (see review by Vehmanen et al. 1995).

Complementary evidence for genetic factors in bipolar disorder is provided by adoption studies. For example, Mendlewicz and Rainer (1977) identified 29 adoptees with bipolar disorder, along with demographically matched control adoptees who either were psychiatrically normal or had had polio during childhood. When the biological and adoptive parents of these adoptees were interviewed, significantly more cases of bipolar disorder, major depression, and schizoaffective disorder were found in the biological parents of the bipolar adoptees than in the biological parents of either of the two control groups. The respective groups of adoptive parents, by contrast, did not differ significantly in the prevalence of these disorders.

3.2 Association and Linkage Studies

Although there have been dozens of reports of linkage between bipolar disorder and genetic loci on various chromosomes, few of these reports have subsequently been confirmed. Among these few are linkages to particular regions of chromosomes X, 18, and 21; each of these linkages has been confirmed by several independent groups. While this suggests that genes in these regions significantly influence susceptibility to bipolar disorder, the effects of these genes on susceptibility may be modest in size (for reviews, see Gershon et al. 1998 and Pekkarinen 1998).

A key difficulty in identifying genes in bipolar disorder is the likely presence of etiologic heterogeneity. One approach to overcoming this challenge is to search for markers of genetic subtypes of bipolar disorder. For example, MacKinnon et al. (1998), after identifying a strong tendency for a subtype of bipolar disorder with concomitant panic disorder to run in families, found strong evidence for linkage of the panic-disorder subtype to marker genes on a region of chromosome 18. There was no evidence of such linkage for bipolar disorder without panic disorder. Other research has found that manic-depressive patients who respond well to lithium treatment represent a subtype that is etiologically more homogeneous, and has a stronger familial tendency, than non-responders. Both linkage and association studies suggest that susceptibility to this lithium-responsive subtype is increased by certain alleles of the gene for phospholipase C, an enzyme important in the phosphoinositide cycle that is thought to be a therapeutic target of lithium (Alda 1999).
4. Genetic Counseling

A recent study (Trippitelli et al. 1998) found that most patients with bipolar disorder and their unaffected spouses would be interested in receiving counseling about their own genetic risk and that of their children. The majority of patients and spouses, for example, would take advantage of a test for susceptibility genes. Even when the precise gene(s) involved in a particular case of schizophrenia or bipolar disorder are unknown, genetic counselors can provide a patient's relatives with an estimated risk of developing the disorder, based on average figures from many different family studies. These risk figures, it should be noted, refer to a person's lifetime risk—an important distinction (e.g., for schizophrenia, one's risk has been cut roughly in half by age 30, and by age 50 it is extremely small). It is crucial that counseling be based on accurate differential diagnosis. Problems often encountered in counseling, such as counselees being confused by genetic information, or feeling fearful and embarrassed, may be heightened in families with bipolar disorder and schizophrenia, because these psychiatric disorders often carry a social stigma, and because many parents have (unfairly) been blamed for their child's illness. For these reasons, Kessler (1980) emphasized the importance of (a) careful follow-up, to make sure that counseling was understood, and (b) the inclusion of professionals with good psychotherapeutic skills as part of the counseling team.
5. Creativity and Liability for Schizophrenia and Bipolar Disorder

There is increasing evidence, from converging lines of research, for the idea that genetic liability for schizophrenia and bipolar disorder is associated with unusual creative potential. This idea, which has long been the subject of theoretical speculation, has received empirical support from several complementary types of studies. For example, a number of studies involving non-clinical samples have reported that more creative subjects tend to score higher on personality test variables that are associated with liability for schizophrenia or bipolar disorder.
5.1 Creativity in Schizophrenics' Biological Relatives

Of even greater interest are studies that have found unusual creativity in samples of the healthier biological relatives of diagnosed schizophrenics. In an Icelandic sample, for example, Karlsson (1970) found that the biological relatives of schizophrenics were significantly more likely than people in the general population to be recognized in Who's Who for their work in creative professions. In the adoption study noted earlier, Heston (1966) serendipitously discovered that among the 'index' adoptees (i.e., those who had a biological mother with schizophrenia) there was a subgroup of psychologically healthy individuals who had more creative jobs and hobbies than the control adoptees. Using a similar research design, Kinney et al. (2000) studied the adopted-away offspring of schizophrenic and control biological parents. The creativity of these adoptees' actual vocational and avocational activities was rated by investigators who were blind to the adoptees' personal and family histories of psychopathology. Real-life creativity was rated as significantly higher, on average, for subjects who, while not schizophrenic, did have signs of magical thinking, recurrent illusions, or odd speech.

5.2 Creativity in Patients with Bipolar Disorder and Their Relatives

There is also evidence for increased creativity among subjects with major mood disorders and their biological relatives. Studies of eminent writers, artists, and composers carried out in the USA, the UK, and France all found significantly higher rates of major mood disorders among these creators than among the general population (e.g., see Jamison 1990). Richards et al. (1988) extended this link by showing that measures of 'everyday,' or non-eminent, creativity were significantly higher, on average, in manic-depressive and cyclothymic subjects and their normal relatives than in control subjects who did not have a personal or family history of mood disorders. Moreover, both creative artists and patients with mood disorders report that their creativity is significantly enhanced during periods of moderately elevated mood (e.g., Richards and Kinney 1990). These complementary findings suggest that the association between increased creative potential and genetic liability for bipolar disorder may extend not only to the millions of people with bipolar disorder, but also to tens of millions of others who, while not ill themselves, may carry genes for the disorder.

5.3 Implications of the Link between Creativity and Genes for Schizophrenia and Bipolar Disorder

It is important to determine what maintains the high prevalence of genes for bipolar disorder in the population, despite the high rates of illness and death that are associated with this disorder. One interesting possibility is that genes which increase liability for bipolar disorder may also be associated with personally and socially beneficial effects, such as increased drive and creativity. The research findings suggesting an association between genes for bipolar disorder and increased creativity are also potentially of great significance in terms of how patients and their families view liability for bipolar disorder, as well as for combating the social stigma that is still often attached to the disorder. Parallel considerations apply in the case of schizophrenia, but perhaps with even greater force, because schizophrenia tends to be an even more chronic and disabling disease, to be associated with even lower fertility, and to carry an even greater social stigma. As rapid advances in molecular biology and the discovery of genetic markers make it possible to detect major genes for schizophrenia (and to identify individuals who carry these genes), it will become increasingly important to know whether such genes are associated with positive, as well as negative, behavioral phenotypes or outcomes—and to understand what genetic and/or environmental modifiers affect how these genes are expressed.

See also: Behavioral Genetics: Psychological Perspectives; Bipolar Disorder (Including Hypomania and Mania); Depression; Genetic Screening for Disease-related Characteristics; Genetics and Development; Genetics of Complex Traits Through the Life Cycle; Intelligence, Genetics of: Cognitive Abilities; Intelligence, Genetics of: Heritability and Causation; Mental Illness, Etiology of; Mental Illness, Genetics of; Personality Disorders; Schizophrenia, Treatment of
Bibliography

Alda M 1999 Pharmacogenetics of lithium response in bipolar disorder. Journal of Psychiatry and Neuroscience 24(2): 154–8
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), 4th edn. American Psychiatric Association, Washington, DC
Gershon E S, Badner J A, Goldin L R, Sanders A R, Cravchik A, Detera-Wadleigh S D 1998 Closing in on genes for manic-depressive illness and schizophrenia. Neuropsychopharmacology 18: 233–42
Gottesman I I, McGuffin P, Farmer A E 1987 Clinical genetics as clues to the 'real' genetics of schizophrenia. Schizophrenia Bulletin 13: 23–48
Heston L L 1966 Psychiatric disorders in foster home reared children of schizophrenic mothers. British Journal of Psychiatry 112: 819–25
Holzman P S, Matthysse S 1990 The genetics of schizophrenia: A review. Psychological Science 1(5): 279–86
Jamison K R 1990 Manic-depressive illness and accomplishment: Creativity, leadership, and social class. In: Goodwin F, Jamison K R (eds.) Manic-depressive Illness. Oxford University Press, Oxford, UK
Karlsson J L 1970 Genetic association of giftedness and creativity with schizophrenia. Hereditas 66: 177–81
Kendler K S, Gruenberg A M, Kinney D K 1994 Independent diagnoses of adoptees and relatives, using DSM-III criteria, in the provincial and national samples of the Danish adoption study of schizophrenia. Archives of General Psychiatry 51: 456–68
Kessler S 1980 The genetics of schizophrenia: A review. Schizophrenia Bulletin 6: 404–16
Kety S S, Wender P H, Jacobsen B, Ingraham L J, Jansson L, Faber B, Kinney D K 1994 Mental illness in the biological and adoptive relatives of schizophrenic adoptees. Replication of the Copenhagen study in the rest of Denmark. Archives of General Psychiatry 51: 442–55
Kinney D K, Yurgelun-Todd D A, Tramer S J, Holzman P S 1998 Inverse relationship of perinatal complications with eye-tracking dysfunction in relatives of patients with schizophrenia: Evidence for a two-factor model. American Journal of Psychiatry 155(7): 976–8
Kinney D K, Richards R L, Lowing P A, LeBlanc D, Zimbalist M A, Harlan P 2000 Creativity in offspring of schizophrenics and controls: An adoption study. Creativity Research Journal 13(1): 17–25
MacKinnon D F, Xu J, McMahon F J, Simpson S G, Stine O C, McInnis M G, DePaulo J R 1998 Bipolar disorder and panic disorder in families: An analysis of chromosome 18 data. American Journal of Psychiatry 155(6): 829–31
Matthysse S, Kidd K K 1976 Estimating the genetic contribution to schizophrenia. American Journal of Psychiatry 133: 185–91
Maziade M, Martinez M, Cliche D, Fournier J P, Garneau Y, Merette C 1994 Linkage on the 11q21–22 region in a severe form of schizophrenia. American Psychiatric Association Annual Meeting, New Research Programs and Abstracts, p. 97
McGuffin P 1989 Genetic markers: An overview and future perspectives. In: Smeraldi E, Belloni L (eds.) A Genetic Perspective for Schizophrenic and Related Disorders. Edi-Ermes, Milan
Mendlewicz J, Rainer J D 1977 Adoption study supporting genetic transmission of manic-depressive illness. Journal of the American Medical Association 222: 1624–7
Murray R M, Takei N, Sham P, O'Callaghan E, Wright P 1993 Prenatal influenza, genetic susceptibility and schizophrenia. Schizophrenia Research 9(2,3): 137
Pekkarinen P 1998 Genetics of bipolar disorder. Psychiatria Fennica 29: 89–109
Richards R, Kinney D K, Lunde I, Benet M, Merzel A P C 1988 Creativity in manic-depressives, cyclothymes, their normal relatives and control subjects. Journal of Abnormal Psychology 97(3): 281–8
Richards R L, Kinney D K 1990 Mood swings and everyday creativity. Creativity Research Journal 3: 202–17
Torrey E F 1987 Prevalence studies in schizophrenia. British Journal of Psychiatry 150: 598–608
Torrey E F, Bowler A E, Taylor E H, Gottesman I I 1994 Schizophrenia and Manic-depressive Disorder. Basic Books, New York
Trippitelli C L, Jamison K R, Folstein M R, Bartko J J, DePaulo J P 1998 Pilot study on patients' and spouses' attitudes toward potential genetic testing for bipolar disorder. American Journal of Psychiatry 155(7): 899–904
Vehmanen L, Kaprio J, Lonnqvist J 1995 Twin studies on concordance for bipolar disorder. Clinical Psychiatry and Psychopathology 26: 107–16
Wyatt R J 1991 Neuroleptics and the natural course of schizophrenia. Schizophrenia Bulletin 17: 325–51
D. K. Kinney
Schizophrenia: Neuroscience Perspective

Schizophrenia is a complex mental illness characterized by acute phases of delusions, hallucinations, and thought disorder, and chronically by apathy, flat affect, and social withdrawal. Schizophrenia affects 1 percent of the world's population, independent of country or culture, and constitutes a severe public health issue (WHO 1975). Schizophrenic patients normally start to display symptoms in their late teens to early twenties, but the time of onset and the course of the illness are very variable (American Psychiatric Association 1987). Approximately a third of patients experience one acute episode, after which they make a more or less full recovery. Another third are affected by the illness throughout their lives, but their symptoms are to some extent alleviated by antipsychotic drugs. The remaining third are so chronically ill that they show little or no improvement, even with medication (Johnstone 1991).
1. Intellectual Function

There is a striking deterioration in intellectual and cognitive function in schizophrenia (e.g., Johnstone 1991). Patients have difficulty in initiating and completing everyday tasks, are easily distracted, and tend to give up when confronted by obstacles. These deficits are similar to the problems in initiation and planning associated with frontal-lobe lesions (Shallice and Burgess 1991). This has led many researchers to suggest that the core deficit of schizophrenia is a failure to activate the frontal cortex appropriately during cognitive tasks involving planning and decision making ('task-related hypofrontality'), a notion supported in part by many functional imaging studies (see below). Schizophrenic patients also show deficits in attention and memory tasks that engage prefrontal, hippocampal, and medial temporal systems, even in drug-free populations (Saykin et al. 1994). In the rest of this entry we shall consider the direct evidence for brain abnormalities in schizophrenia.
2. Post-mortem Neuropathology

Studies of post-mortem brains (Harrison 1999) have observed a decrease in overall brain size and have linked schizophrenia to structural abnormalities in the prefrontal cortex and temporal lobe, especially the hippocampus and amygdala. Several studies have found that schizophrenic brains tend to have enlarged lateral ventricles compared to nonschizophrenic brains. Histological studies have shown evidence for abnormal synaptic appearance in the cingulate and hippocampal pyramidal cells in schizophrenia. The most reproducible positive anatomical finding in post-mortem hippocampal formation has been the reduced size of neuronal cell bodies in schizophrenia. However, these changes have not been found in all studies.
3. In Vivo Studies of Brain Structure

3.1 Computerized Tomography (CT) Studies

The main finding from CT scan studies is that the lateral ventricles are enlarged in schizophrenic patients compared to normal controls (Van Horn and McManus 1992). Although the finding of increased ventricular volume in schizophrenic patients is widespread and replicated, the difference between schizophrenic patients and normal controls is small, and overlap with the normal population is appreciable. Correlations between CT findings and symptoms have been investigated (Lewis 1990). Enlarged ventricles have been associated with chronicity of illness, poor treatment response, and neuropsychological impairment in many, but not all, of these studies.

3.2 Magnetic Resonance Imaging (MRI) Studies

MRI studies (see Harrison 1999) have tended to confirm the finding of enlarged ventricles in schizophrenic patients, but also permit a more detailed analysis of brain structure. Temporal lobe reductions have been reported, and are especially prominent in the hippocampus, parahippocampal gyrus, and amygdala, but have not been observed in all studies. Robust relationships between temporal lobe reductions and clinical features have not yet been found. Some studies have found frontal lobe reductions in schizophrenic patients. Reversal or reduction of normal structural cerebral asymmetries may be related to the pathogenesis of schizophrenia (Crow 1995). Various unusual symmetries have been observed, consistent with the hypothesis that failure to develop normal asymmetry is an important component of the pathology underlying some forms of schizophrenia. There are also MRI data that provide support for a hypothesis of disconnection between brain areas in schizophrenia; these results support the existence of a relative 'fronto-temporal dissociation' in schizophrenia. Evidence for such dissociation has also been obtained in functional imaging studies (see below). All these studies used a 'regions of interest' approach, in which measurements were restricted to prespecified brain regions. More recently, techniques have been developed with which differences can be detected automatically throughout the brain. Using such techniques, Andreasen et al. (1994) observed decreased thalamus size in schizophrenic patients, consistent with observations in post-mortem brains.
4. Functional Imaging

4.1 Resting Studies Using Positron Emission Tomography (PET)

More sensitive measures of brain integrity can be obtained by measuring cerebral blood flow in a patient at rest. However, these results are difficult to interpret, because the pattern of blood flow may be altered by the current mental state of the patient and by medication. Early studies of this sort observed a relative reduction of blood flow in the frontal lobes of patients with schizophrenia (Ingvar and Franzen 1974). This pattern of activity became known as hypofrontality. However, subsequent studies have not always replicated this observation. Several studies have looked at clinical correlates associated with hypofrontality, but the results are inconsistent. Among the features showing a positive relationship with hypofrontality are chronicity, negative symptoms, and neuropsychological task impairment. Relations have also been observed between the pattern of blood flow and the symptomatology of patients at the time of scanning. For example, patients manifesting 'psychomotor poverty' showed reduced blood flow in dorsolateral prefrontal cortex (Liddle et al. 1992).

4.2 The Dopamine Hypothesis

One of the most robust findings in schizophrenia research has been the observation that drugs which block dopamine receptors are effective in reducing the severity of symptoms such as hallucinations and delusions (Seeman 1986). This led to the dopamine (DA) hypothesis of schizophrenia, which posits that schizophrenia is caused by an overactivity of dopamine receptors (Van Rossum 1966). The best way to investigate the dopamine hypothesis is the in vivo visualization of radioactive ligand binding to quantify dopamine receptor densities in drug-naive patients using PET. These studies suggest that, compared to healthy controls, patients with schizophrenia show a significant but mild increase in, and a larger variability of, D2 receptor density (Laruelle 1998). There is also evidence that D1 receptor density is reduced in the prefrontal cortex of schizophrenic patients (Okubo et al. 1997).
5. Cognitive Activation Studies

Functional neuroimaging experiments generally evaluate brain activity associated with performance of cognitive or sensori-motor tasks. Cognitive activation studies have provided further evidence for decreased frontal activity ('hypofrontality') in schizophrenia. In addition, there is increasing evidence that schizophrenics show abnormal integration between the frontal cortex and other brain regions, including the temporal lobes, the parietal lobes, and hippocampus, during cognitive tasks.

5.1 Task-based Studies of Executive Function

Schizophrenia is characterized largely by impairments in planning and execution, and therefore tasks that involve this kind of planning and modification of behavior have been exploited in the scanner. Several studies have found that schizophrenic patients show reduced activity in the dorsolateral prefrontal cortex (DLPFC) while performing the Wisconsin card sorting task, a popular test of planning. In addition to decreased DLPFC activity, schizophrenic patients also showed abnormal responses in the temporal lobes and parahippocampal gyrus (Ragland et al. 1998). The results suggest that schizophrenia may involve a breakdown in the integration between the frontal and temporal cortex, which is necessary for executive and planning demands in healthy individuals. This interpretation moves away from the simple notion that dysfunction in isolated brain regions explains the cognitive deficits in schizophrenia, and towards the idea that neural abnormality in schizophrenia reflects a disruption of integration between brain areas.

5.2 Willed Action

Willed actions are self-generated in the sense that the subject makes a deliberate and free choice to perform one action rather than another. Willed actions are a fundamental component of executive tasks. In normal subjects, willed actions are associated with increased blood flow in the DLPFC. Schizophrenic patients, especially those with negative signs, have difficulty with tasks involving free choices, and show an associated lack of activity in the DLPFC. Activity in this region normalizes as the symptoms decrease (Spence et al. 1998). This suggests that hypofrontality depends on current symptoms. Studies of willed action also suggest that the underactivity in the DLPFC observed in some schizophrenic patients is accompanied by overactivity in posterior brain regions. There is evidence of a lack of the normal reciprocal interaction between the frontal and the superior temporal cortex in schizophrenia, which supports the notion of impaired functional integration (McGuire and Frith 1996).

5.3 Memory Tasks

Memory impairments are an especially enduring feature of schizophrenia. Functional neuroimaging studies of memory have demonstrated hypofrontality, abnormal interaction between temporal and frontal cortex, and a dysfunctional cortico-cerebellar circuit in schizophrenic patients compared to control subjects.
The hippocampus is a brain structure that is well known to be involved in memory. Evidence for impaired hippocampal function in schizophrenia was found in a well-controlled functional imaging study (Heckers et al. 1998). In this study, schizophrenic patients failed to recruit the hippocampus during successful retrieval, unlike normal control subjects. The schizophrenic patients also showed a more widespread activation of prefrontal areas and parietal cortex during recollection than did controls. The authors propose that this overactivation represents an ‘effort to compensate for the failed recruitment of the hippocampus.’ This result supports the idea that neural abnormality in schizophrenia reflects a disruption of integration between brain areas. Fletcher et al. (1999) also found evidence for abnormal integration between brain areas in schizophrenia during the performance of a memory task. They demonstrated an abnormality in the way in which left prefrontal cortex influenced activity in left superior temporal cortex, and suggested that this abnormality was due to a failure of the anterior cingulate cortex to modulate the prefronto-temporal relationship.
6. Imaging Symptoms

Functional neuroimaging is also useful for evaluating neural activity in patients experiencing specific psychotic symptoms, such as hallucinations and passivity.

6.1 Hallucinations

Hallucinations, perceptions in the absence of external stimuli, are prominent among symptoms of schizophrenia. Functional neuroimaging studies of auditory hallucinations suggest that they involve neural systems dedicated to auditory speech processing as well as a distributed network of other cortical and subcortical areas (Dierks et al. 1999). There is also evidence that the activity associated with auditory hallucinations resembles that seen when normal subjects are using inner speech (McGuire et al. 1996).

6.2 Passivity

Passivity symptoms, or delusions of control, in which patients claim that their actions and speech are being controlled by an external agent, are common in schizophrenia. Schizophrenic patients with passivity showed hyperactivation of the inferior parietal lobe (BA 40), the cerebellum, and the cingulate cortex relative to schizophrenic patients without passivity and to normal controls. When patients no longer experienced passivity symptoms, a reversal of the hyperactivation of the parietal lobe and cingulate was seen (Spence et al. 1997). Hyperactivity in parietal cortex may reflect the 'unexpected' nature of the movement experienced by patients: the movement feels as though it is being caused by an external force (Frith et al. 2000).
7. Conclusions

There is considerable evidence for structural abnormalities in the brains of patients with schizophrenia, but the abnormalities identified so far are not specific to this disorder, are very variable, and cannot easily be related to the symptoms. The neurotransmitter dopamine is clearly important in schizophrenia, but its precise role remains unclear. Studies of brain function are still at an early stage, but suggest that schizophrenia may be characterized by disorders of connectivity between cortical and subcortical regions. Some of the symptoms of schizophrenia can be understood in terms of these disconnections. Current advances in imaging techniques aimed at measuring connectivity in the brain are likely to have a major impact on our understanding of schizophrenia.

See also: Mental Illness, Epidemiology of; Mental Illness, Etiology of; Mental Illness, Genetics of; Psychiatric Assessment: Negative Symptoms; Schizophrenia; Schizophrenia and Bipolar Disorder: Genetic Aspects; Schizophrenia, Treatment of
Bibliography

American Psychiatric Association 1987 Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R), 3rd edn. American Psychiatric Association, Washington, DC
Andreasen N C, Arndt S, Swayze V II, Cizadlo T, Flaum M, O'Leary D S, Ehrhardt J C, Yuh W T 1994 Thalamic abnormalities in schizophrenia visualised through magnetic resonance image averaging. Science 266: 294–8
Crow T J 1995 Aetiology of schizophrenia: An evolutionary theory. International Clinical Psychopharmacology 10 (Suppl. 3): 49–56
Dierks T, Linden D E J, Jandl M, Formisano E, Goebel R, Lanfermann H, Singer W 1999 Activation of Heschl's gyrus during auditory hallucinations. Neuron 22: 615–21
Fletcher P C, McKenna P J, Friston K J, Frith C D, Dolan R J 1999 Abnormal cingulate modulation of fronto-temporal connectivity in schizophrenia. NeuroImage 9: 337–42
Frith C D, Blakemore S-J, Wolpert D M 2000 Explaining the symptoms of schizophrenia: Abnormalities in the awareness of action. Brain Research Reviews 31: 357–63
Harrison P J 1999 The neuropathology of schizophrenia. Brain 122: 593–624
Heckers S, Rauch S L, Goff D, Savage C R, Schacter D L, Fischman A J, Alpert N M 1998 Impaired recruitment of the hippocampus during conscious recollection in schizophrenia. Nature Neuroscience 1: 318–23
Ingvar D H, Franzen G 1974 Distribution of cerebral activity in chronic schizophrenia. Lancet 2: 1484–86
Johnstone E C 1991 Defining characteristics of schizophrenia. British Journal of Psychiatry Supplement 13: 5–6
Laruelle M 1998 Imaging dopamine transmission in schizophrenia. A review and meta-analysis. Quarterly Journal of Nuclear Medicine 42: 211–21
Lewis S W 1990 Computerised tomography in schizophrenia 15 years on. British Journal of Psychiatry Supplement 9: 16–24
Liddle P F, Friston K J, Frith C D, Frackowiak R S 1992 Cerebral blood flow and mental processes in schizophrenia. Journal of the Royal Society of Medicine 85: 224–7
McGuire P K, Frith C D 1996 Disordered functional connectivity in schizophrenia. Psychological Medicine 26: 663–7
McGuire P K, Silbersweig D A, Wright I, Murray R M, Frackowiak R S, Frith C D 1996 The neural correlates of inner speech and auditory verbal imagery in schizophrenia: Relationship to auditory verbal hallucinations. British Journal of Psychiatry 169: 148–59
Okubo Y, Suhara T, Suzuki K, Kobayashi K, Inoue O, Terasaki O, Someya Y, Sassa T, Sudo Y, Matsushima E, Iyo M, Tateno Y, Toru M 1997 Decreased prefrontal dopamine D1 receptors in schizophrenia revealed by PET. Nature 385: 634–6
Ragland J D, Gur R C, Glahn D C, Censits D M, Smith R J, Lazarev M G, Alavi A, Gur R E 1998 Frontotemporal cerebral blood flow change during executive and declarative memory tasks in schizophrenia: A positron emission tomography study. Neuropsychology 12: 399–413
Saykin A J, Shtasel D L, Gur R E, Kester D B, Mozley L H, Stafiniak P, Gur R C 1994 Neuropsychological deficits in neuroleptic naive patients with first-episode schizophrenia. Archives of General Psychiatry 51: 124–31
Seeman P 1986 Dopamine/neuroleptic receptors in schizophrenia. In: Burrows G D, Norman T R, Rubenstein G (eds.) Handbook on Studies of Schizophrenia, Part 2. Elsevier, Amsterdam
Shallice T, Burgess P W 1991 Deficits in strategy application following frontal lobe damage in man. Brain 114: 727–41
Spence S A, Brooks D J, Hirsch S R, Liddle P F, Meehan J, Grasby P M 1997 A PET study of voluntary movement in schizophrenic patients experiencing passivity phenomena (delusions of alien control). Brain 120: 1997–2011
Spence S A, Hirsch S R, Brooks D J, Grasby P M 1998 Prefrontal cortex activity in people with schizophrenia and control subjects. Evidence from positron emission tomography for remission of 'hypofrontality' with recovery from acute schizophrenia. British Journal of Psychiatry 172: 316–23
Van Horn J D, McManus I C 1992 Ventricular enlargement in schizophrenia. A meta-analysis of studies of the ventricle:brain ratio. British Journal of Psychiatry 160: 687–97
Van Rossum J M 1966 The significance of dopamine receptor blockade for the mechanism of action of neuroleptic drugs. Archives Internationales de Pharmacodynamie et de Thérapie 160: 492–94
World Health Organization 1975 Schizophrenia: A Multinational Study. WHO, Geneva, Switzerland
C. D. Frith and S-J Blakemore
Schizophrenia, Treatment of

Schizophrenia is a brain disorder of unknown origin which severely impairs numerous complex functions of the central nervous system (CNS), including thought, emotion, perception, cognition, and behavior. Schizophrenia is one of the most debilitating psychiatric disorders. Due to the high lifetime prevalence (about 1.5 percent), the typical onset in early adulthood, and the strong tendency towards a chronic
course, the disorder requires a very high degree of health care provisions. About one-quarter of hospital beds are occupied by schizophrenic patients, and the total costs of treatment are enormous (e.g., US$50 billion per year in the United States). Although there is no cure for schizophrenia, the combined administration of pharmacological and psychosocial interventions considerably improves outcome, enhances quality of life in the patients affected, and enables social integration to a large extent.
1. Causes and Pathophysiology of Schizophrenia

The causes of schizophrenia are so far unknown. Therefore, the term 'schizophrenia' refers to an empirically defined syndrome characterized by a combination of certain symptoms which occur in a particular temporal pattern. The idea that schizophrenia is a distinct brain disorder is rooted in Emil Kraepelin's concept of dementia praecox (Kraepelin 1893). This concept emphasized one particular aspect of the disorder: the onset of persistent cognitive disturbances early in life. The term 'schizophrenia' was coined by Eugen Bleuler (1916), who wanted to emphasize the loss of coherence between thought, emotion, and behavior which represents another important feature of the disorder. Bleuler actually spoke about 'schizophrenias,' implying a group of diseases rather than one distinct disease entity. The present diagnostic systems compiled by the World Health Organization (ICD-10 1992) and the American Psychiatric Association (DSM-IV 1994) distinguish various subtypes of schizophrenia (see Table 1), which are classified according to particular symptom combinations and according to certain aspects of the course and prognosis of the disease. However, these subtypes are defined in a phenomenological manner which implies neither distinct causes for any of the subtypes nor any particular treatment. Although we do not know the individual causes of schizophrenia, we know that the basis for the disorder is an interaction between genetic susceptibility factors and environmental components. According to numerous genetic studies the estimates of heritability converge to about 80 percent (Owen and Cardno 1999). However, although linkage studies have yielded
evidence for a number of susceptibility loci on various chromosomes, neither a single gene nor a combination of genes has so far been definitively implicated. Similarly, a number of environmental factors (e.g., birth and pregnancy complications, viral infections) are likely to play a role, but details of this role remain to be established.

Although the symptoms of schizophrenia clearly demonstrate a severe disturbance of brain functions, only subtle alterations of brain morphology have been detected. The most stable finding is a slight enlargement of the brain ventricles. Recent sophisticated studies combining morphological and functional brain imaging suggest that subtle structural or functional lesions, particularly in prefrontal and limbic neural circuits, impair the integrity of these circuits (Andreasen et al. 1998). Although it is still a matter of debate whether these lesions are due to a neurodevelopmental or a neurodegenerative process (Lieberman 1999), it is clear that disturbed functioning of neuronal circuits goes along with altered neurotransmission. Dopamine, serotonin, and glutamate are the neurotransmitters most frequently implicated in the pathophysiology of schizophrenia. However, it is still not clear whether disturbed neurotransmission is the cause or the consequence of the major disease process.

Given this scarcity of etiological and pathophysiological knowledge, the treatment of patients suffering from schizophrenia is based exclusively on empirical clinical knowledge. As outlined in detail below, the treatment approach is multimodal and based on the symptom patterns present in any individual patient (see Mental Illness, Etiology of).
2. Symptoms of Schizophrenia—the Targets for Treatment

Schizophrenia affects almost all areas of complex brain function, in particular thought, perception, emotion, cognition, and behavior. There have been numerous approaches to systematize the plethora of symptoms according to a variety of theoretical concepts. One widely accepted concept is the distinction between positive and negative symptoms (Andreasen
Table 1 Subtypes of schizophrenia according to the major classification systems

ICD-10                            DSM-IV
Paranoid schizophrenia            Schizophrenia, paranoid type
Catatonic schizophrenia           Schizophrenia, catatonic type
Undifferentiated schizophrenia    Schizophrenia, undifferentiated type
Residual schizophrenia            Schizophrenia, residual type
Hebephrenic schizophrenia         —
—                                 Schizophrenia, disorganized type
Simple schizophrenia              —
Table 2 Positive and negative symptoms of schizophrenia. Areas of symptoms are listed and selected examples appear in parentheses

Positive symptoms
Hallucinations (auditory, somato-tactile, visual, olfactory)
Delusions (persecutory, religious, thought broadcasting, thought insertion)
Bizarre behavior (unusual clothing or appearance, repetitive, stereotyped behavior)
Positive formal thought disorders (incoherence, illogicality, tangentiality)

Negative symptoms
Affective flattening (unchanging facial expression, paucity of expressive gestures)
Alogia (poverty of speech, increased response latency)
Avolition-apathy (physical anergia, problems with personal hygiene)
Anhedonia-asociality (reduction in social activities, sexual interest, closeness)
Attention (social inattentiveness, inattentiveness during testing)
et al. 1995, Table 2) (see Psychiatric Assessment: Negative Symptoms). This approach groups symptoms according to whether they represent a loss of or a deficiency in a normal brain function (negative symptoms) or the appearance of abnormal phenomena (positive symptoms). From the perspective of treatment the positive–negative dichotomy is of importance because these two domains react somewhat differently to treatment. In general, positive symptoms respond fairly well to treatment with antipsychotic drugs, whereas negative symptoms are rather difficult to influence. The latter is particularly problematic, because negative symptoms are usually the limiting factor for personal and social rehabilitation.

In addition to the typical symptoms, patients suffering from schizophrenia often present numerous other psychiatric problems. Among these, anxiety, sleep disturbances, obsessions and compulsions, depression, and drug or substance abuse are particularly frequent. Although these additional symptoms and problems may in many cases be the consequence of the typical symptoms rather than signs of independent additional psychiatric disorders, they complicate the course of the disease and often necessitate specific treatment approaches.

The course of schizophrenia varies considerably between patients. Typically, the disease begins during early adulthood with the appearance of rather unspecific negative symptoms resulting in social withdrawal and impaired social and scholastic adjustment. Months to years later positive symptoms such as delusions and hallucinations appear either gradually or abruptly. Across time, positive symptoms tend to show an episodic course, whereas negative symptoms either remain quite stable or even progress. The associated psychiatric problems mentioned above do not show any systematic temporal relationship to the course of schizophrenia itself. Since the syndrome of schizophrenia has been defined, data on the final outcome have varied considerably, depending mainly on the diagnostic concepts applied. There is no doubt that full remission occurs, but probably in less than 20
percent of patients. In the majority of patients schizophrenia takes a chronic course (Schultz and Andreasen 1999). The major benefit from modern treatment strategies is probably not a substantial increase in the rate of full remissions, but a significant reduction in the number of extremely ill patients, and in the severity and number of episodes characterized by prominent positive symptoms.
3. The Principles of Treatment

Treatment of schizophrenia is usually multimodal and comprises approaches from two major areas: drug treatment and psychosocial interventions. In general, appropriate drug treatment is a prerequisite for the ability of the patients to comply with and actively take part in psychosocial treatments. The more effective drug treatment is, the more specialized and sophisticated psychosocial interventions can be successfully applied. Vice versa, appropriate psychosocial treatment considerably improves compliance with drug treatment, because it enhances insight into the disease process, which initially is poor in many patients suffering from schizophrenia.

Antipsychotic drugs (also called neuroleptics) are the most important and most effective therapeutic tool. The major targets of these drugs are positive symptoms, although newer substances might also reduce negative symptoms to some extent (see below). Treatment with antipsychotic drugs is usually a long-term treatment, whereas drugs to control accessory symptoms (anxiolytics, hypnotics, and antidepressants) are prescribed intermittently when needed. The use of electroconvulsive therapy was widespread before antipsychotic drugs were available, but today is very limited, although this treatment is effective in certain conditions (Fink and Sackeim 1996).

Among the psychosocial interventions, supportive and psychoeducative approaches are feasible for the majority of patients, whereas structured social skill training, family therapy, or complex programs
including cognitive-behavioral therapies require a considerable degree of insight and compliance. Although recent psychosocial treatment approaches also incorporate psychodynamic aspects, classical psychodynamic psychotherapy is generally not effective and sometimes even counterproductive.

Prior to the discovery of antipsychotic drugs, most schizophrenic patients were hospitalized for decades or even for their entire life. Today, a highly differentiated range of treatment facilities is available, to a large extent community based. These include, besides classical hospitals and outpatient departments, short-term crisis intervention facilities, day-time clinics, and supported living facilities. This network of facilities considerably increases the chance of social integration and reduces the overall time spent in hospitals. However, recent aggressive attempts to shorten hospital stays dramatically might lead to a high frequency of rehospitalization, poor long-term outcome, and an overall increase in the costs of treatment.

An important limiting factor for all treatment approaches is the patients' compliance, which is often poor, particularly in severely ill patients. In schizophrenia, noncompliance is due particularly to a lack of insight into the fact of being ill, a reduced ability to take an active part in the treatment process because of negative symptoms, or active avoidance of treatment driven by positive symptoms such as delusions or imperative voices. A trusting and empathetic attitude of the people involved is an important prerequisite for stable treatment compliance in patients suffering from schizophrenia.
4. Particular Aspects of Treatment

4.1 Pharmacotherapy

The introduction of chlorpromazine into clinical practice in 1952 was the milestone of the pharmacological treatment of schizophrenia (Delay and Deniker 1952). Chemically, chlorpromazine is a phenothiazine, and other substances of this group also possess antipsychotic properties. In the 1950s and 1960s, numerous antipsychotic drugs with different structures (in particular butyrophenones [e.g., haloperidol] and thio-
xanthenes [e.g., flupentixol]) were developed. Despite this diversity in chemical structure, these first-generation drugs do not differ qualitatively with respect to their profile of beneficial and undesired effects. They target mainly the positive symptoms of schizophrenia mentioned in Table 2 without preferentially affecting one or the other of these symptoms. Their effectiveness does not depend on the subtype of schizophrenia (see Table 1).

The common mode of action of the first-generation antipsychotics is believed to be blockade of the D2 subtype of dopamine receptors (Pickar 1995). Blockade of these receptors in the mesolimbic dopamine system is thought to cause a reduction in positive symptoms, and blockade in the nigrostriatal dopamine system is believed to be responsible for the typical side effects of these drugs. These are severe disturbances of motor behavior caused by a drug-induced dysfunction of the dopaminergic extrapyramidal system, which plays a pivotal role in the control of movements. Table 3 summarizes these extrapyramidal syndromes. Those which occur acutely are often very disturbing, but respond fairly well to treatment (e.g., acute dystonia to anticholinergic drugs), or at least cease upon drug discontinuation or dose reduction. Tardive extrapyramidal syndromes, however, are frequent (5–10 percent on long-term treatment with first-generation antipsychotics), often resistant to treatment approaches, and they usually persist if the antipsychotic drug is stopped. Another severe side effect of unknown cause, which might be related to the extrapyramidal system, is the neuroleptic malignant syndrome (Pelonero et al. 1998; see also Sect. 4.3). Extrapyramidal side effects are common to all first-generation drugs, whereas other, less specific side effects (e.g., sedation, weight gain, postural hypotension) occur with some but not all substances (see Psychopharmacotherapy: Side Effects).

Due to the central role attributed to D2 dopamine receptor blockade, it was long thought that antipsychotic effectiveness and extrapyramidal syndromes are inevitably linked to each other. However, in the late 1960s it turned out that the dibenzodiazepine clozapine is a potent antipsychotic drug that almost never induces extrapyramidal symptoms and is effective in a proportion of otherwise treatment-resistant patients (Angst et al. 1971). Moreover, it was
Table 3 Extrapyramidal syndromes induced by first-generation antipsychotic drugs

Acute (within days)
Drug-induced Parkinsonism (bradykinesia, increased muscular tone, tremor)
Dystonia (sudden-onset, sustained muscular contractions, preferably in the cephalic musculature)
Akathisia (internal restlessness and urge to move, often coupled with pacing behavior)

Late or tardive (within months to years)
Tardive dyskinesia or dystonia (irregular choreiform or dystonic movements in any voluntary muscle group)
Tardive tics (repetitive, short-lasting, stereotyped movements or other kinds of behavior [e.g., vocalization])
Tardive myoclonus (brief asynchronous movements which are not stereotyped)
shown that clozapine improves not only positive but to some extent also negative symptoms. Therefore, the introduction of clozapine into clinical practice was the first qualitative advance after the discovery of chlorpromazine, and it is justified to call this substance a second-generation antipsychotic drug. Unfortunately, it quickly turned out that clozapine has a very particular and important drawback: it induces life-threatening agranulocytosis in about one percent of patients treated (Baldessarini and Frankenburg 1991). Therefore, the use of this drug is only possible when white blood cell counts are monitored at short intervals.

It is still not known why clozapine is a potent antipsychotic drug but lacks extrapyramidal side effects. Among the possible reasons are that clozapine affects mainly the mesolimbic dopamine system, that it blocks D1 receptors to the same extent as D2 receptors, that it has a high affinity for D4 receptors, and that it also has a high affinity for serotonin receptors. This theoretical framework was the basis for the development of various second-generation antipsychotics (see Table 4). Most of them have high affinities for serotonin receptors, although only one substance (olanzapine) is similar to clozapine in its equivalent binding to D1 and D2 receptors. So far there is no indication that any of these newer drugs carries a substantial risk of agranulocytosis, the most dangerous side effect of clozapine. Moreover, these substances have been shown to be effective when compared to first-generation antipsychotic drugs and indeed to induce fewer extrapyramidal side effects. For two of them (olanzapine and risperidone) there are data suggesting that they might also ameliorate negative symptoms. However, it still remains to be seen how the effectiveness of these newer drugs compares to clozapine.

Treatment with antipsychotic drugs in schizophrenic patients is usually a long-term treatment continuing for many years. Intermittently, patients may need additional psychotropic medications. Most frequently anxiolytics, hypnotics, and antidepressants are administered. The recognition and treatment of intervening depressive episodes is particularly important because depressive and negative symptoms are sometimes difficult to differentiate, but need different treatment. When administering antidepressant drugs to schizophrenic patients, one should be aware that positive symptoms sometimes exacerbate during treatment with antidepressants.

4.2 Psychosocial Treatment

It is now generally accepted that psychosocial treatment of patients with schizophrenia is not an alternative to pharmacological treatment, but rather an important complementary approach (Bustillo et al. 2000). Psychosocial measures yield little benefit for positive symptoms; in contrast, intensive psycho-
dynamic psychotherapy might induce exacerbations. The most important and successful interventions are psychoeducative teaching and training in social skills and problem solving. Although it is still debated whether these treatments are directly effective against negative symptoms, there is no doubt that they considerably improve coping with these symptoms.

The major goal of educational approaches is to increase knowledge about and insight into the disease process and, as a consequence, to improve treatment compliance. Educational programs are focused on knowledge about the disease process and its course, the mode of action and side effects of medication, early signs of exacerbation such as restlessness and insomnia, and detailed information about the various community-based or hospital facilities available. These programs are particularly effective when important family members are educated as well. Moreover, approaches to improve the communication styles within the patient's network of social relationships are often useful, although the concept that particular forms of communication ('high expressed emotions') play a causative role in schizophrenia has been seriously challenged.

Schizophrenic patients' social skills are often poor. This is one major reason for the often dramatically reduced amount of social contact. Training of social skills in patients suffering from schizophrenia is performed in a group setting. Such groups are focused on the repetitive training of behavior in various social situations of importance for everyday life. In addition, some rather basic teaching of the theoretical background of successful social interaction might be included.

Impaired problem solving in schizophrenic patients is due to a combination of disturbances in divided attention, planning, and motivation. Problem-solving training includes the analysis of problems relevant to daily life and, particularly, dividing a problem into subproblems that can be solved successively. Similar to the training of social skills, the steps of the solution are practiced repeatedly.

In recent years, more complex psychotherapeutic approaches have been developed, combining cognitive, behavioral, or psychodynamic approaches, which, however, are suitable for only a minority of patients. In particular for chronic patients, who in earlier times were often hospitalized for decades, providing supervised residential living arrangements has proven to be of considerable benefit, because in this setting the degree of self-sustainment can be maximized in parallel with the permanent provision of professional help.

4.3 Emergency Treatment

Schizophrenia is not only a debilitating but also a life-threatening disorder. If severe positive symptoms are present, such as verbal imperative hallucinations,
Table 4 Relative receptor affinities of selected newer antipsychotic compounds compared to the first-generation drug haloperidol. The number of + signs indicates relative affinity; – indicates negligible or no affinity; * no data. Adapted from Pickar (1995)

               Dopamine     Serotonin       Noradrenaline   Histamine   Acetylcholine
               D1    D2     5-HT1   5-HT2   α1     α2       H1          M1
Haloperidol(a) +++   ++++   –       +       ++     –        –           –
Clozapine      ++    ++     +       +++     +++    +++      ++++        +++++
Risperidone    ++    ++++   ++      +++++   +++    +++      ++          –
Olanzapine     +++   +++    –       +++     +++    –        ++++        +++++
Seroquel       +     ++     –       +       ++++   +        ++++        +++
Sertindole     +++   +++++  ++      ++++    +++    –        +           +
Ziprasidone    +     ++++   *       ++++    ++     –        +           –

(a) Prototype first-generation antipsychotic drug
delusions of persecution, or severe formal thought disturbances, the patient’s life is threatened by a misinterpretation of his or her environment (e.g., of dangerous objects or situations) and by suicidal intentions. Suicidality can also emerge when negative symptoms are prominent. Acute suicidality and an acute confusional psychotic state both necessitate immediate treatment, including the administration of anxiolytics, antipsychotics, or both, preferably in a hospital setting. Although every effort should be undertaken to get the informed consent of the patient, it is often necessary to initiate treatment against his or her explicit will. This has to be done in accordance with the legal regulations in the respective country. Less frequently occurring life-threatening situations in patients suffering from schizophrenia are febrile catatonia and the neuroleptic malignant syndrome, a rare side effect of antipsychotic drugs (Pelonero et al. 1998). Both conditions show similarities, including fever, disturbances of motor behavior, and reduced responsiveness to external stimuli. The differential diagnosis is difficult, but very important because in febrile catatonia treatment with antipsychotic drugs is necessary, whereas such treatment has to be immediately stopped when a neuroleptic malignant syndrome is suspected. In both of these conditions admission to an intensive care unit is necessary.
5. Future Perspectives on the Treatment of Schizophrenia

Despite the rapid growth of knowledge about human genetics, it is unlikely that we will see a major advance in our understanding of the genetic causes of schizophrenia in the near future. The reason is that the available epidemiological data render a major contribution of single genes quite unlikely and suggest that multiple genes contribute to the susceptibility to schizophrenia. However, it is likely that genetic investigations will improve the management of schizophrenia by enabling prediction of the treatment response to antipsychotic drugs. Very recently, for
example, it has been shown that a combination of six polymorphisms in neurotransmitter-receptor-related genes enables a significant prediction of the treatment response to clozapine (Arranz et al. 2000). Although these preliminary studies must be replicated and extended to other drugs, it is probable that this approach will significantly enhance the success of drug treatment in schizophrenia.

It is also likely that the near future will bring new drugs which are the result of a 'fine-tuning' of the chemical structure and of the binding profile to dopamine and serotonin receptors of those substances which are presently available. Another approach currently being pursued is to target glutamate receptors, although the first clinical trials are only partly encouraging (Farber et al. 1999). Based on a theory implicating the immune system in the pathophysiology of schizophrenia, studies are being started that use immunomodulation as a treatment. This is an interesting and promising approach, which might also help to clarify the mode of action of presently available antipsychotic drugs, because some of them influence the production of cytokines, which are important immune mediators (Pollmächer et al. in press). Another approach based on pathophysiological theories is to target phospholipids. These are an important cell-membrane component, and there is some evidence suggesting that either their uptake is deficient or they are excessively broken down in schizophrenic patients (Horrobin 1999).

See also: Nosology in Psychiatry
Bibliography

American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders, 4th edn. American Psychiatric Association, Washington, DC
Andreasen N C, Arndt S, Alliger R, Miller D, Flaum M 1995 Symptoms of schizophrenia. Methods, meanings, and mechanisms. Archives of General Psychiatry 52: 341–51
Andreasen N C, Paradiso S, O'Leary D S 1998 'Cognitive dysmetria' as an integrative theory of schizophrenia: a dysfunction in cortical-subcortical-cerebellar circuitry? Schizophrenia Bulletin 24: 203–18
Angst J, Bente D, Berner P, Heimann H, Helmchen H, Hippius H 1971 Das klinische Wirkungsbild von Clozapin. Pharmakopsychiatrie 4: 201–11
Arranz M J, Munro J, Birkett J, Bolonna A, Mancama D, Sodhi M, Lesch K P, Meyer J F, Sham P, Collier D A, Murray R M, Kerwin R W 2000 Pharmacogenetic prediction of clozapine response. Lancet 355: 1615–16
Baldessarini R J, Frankenburg F R 1991 Drug therapy – Clozapine. A novel antipsychotic agent. New England Journal of Medicine 324: 746–54
Bustillo J, Keith S J, Lauriello J 2000 Schizophrenia: psychosocial treatment. In: Sadock B J, Sadock V A (eds.) Kaplan & Sadock's Comprehensive Textbook of Psychiatry, 7th edn. Lippincott Williams and Wilkins, Baltimore, MD, pp. 1210–7
Delay J, Deniker P 1952 38 cas de psychoses traitées par la cure prolongée et continue de 4560 R.P. Annales médico-psychologiques 110: 364–741
Farber N B, Newcomer J W, Olney J W 1999 Glycine agonists: what can they teach us about schizophrenia? Archives of General Psychiatry 56: 13–17
Fink M, Sackeim H A 1996 Convulsive therapy in schizophrenia? Schizophrenia Bulletin 22: 27–39
Horrobin D F 1999 Lipid metabolism, human evolution and schizophrenia. Prostaglandins Leukotrienes and Essential Fatty Acids 60: 431–7
Kraepelin E 1910 Psychiatrie. Ein Lehrbuch für Studierende und Ärzte, 8. Auflage. Barth, Leipzig
Lieberman J A 1999 Is schizophrenia a neurodegenerative disorder? A clinical and neurobiological perspective. Biological Psychiatry 46: 729–39
Owen M J, Cardno A G 1999 Psychiatric genetics: progress, problems, and potential. Lancet 354 (Suppl. 1): SI11–14
Pelonero A L, Levenson J L, Pandurangi A K 1998 Neuroleptic malignant syndrome: a review. Psychiatric Services 49: 1163–72
Pickar D 1995 Prospects for pharmacotherapy of schizophrenia. Lancet 345: 557–62
Pollmächer T, Haack M, Schuld A, Kraus T, Hinze-Selch D in press Effects of antipsychotic drugs on cytokine networks. Journal of Psychiatric Research
Schultz S K, Andreasen N C 1999 Schizophrenia. Lancet 353: 1425–30
World Health Organization 1996 International Statistical Classification of Diseases. World Health Organization, Geneva, Switzerland
T. Pollmächer

School Achievement: Cognitive and Motivational Determinants

Although the analysis of school achievement (SA)—its structure, determinants, correlates, and consequences—has always been a major issue of educational psychology, interest in achievement as a central outcome of schooling has considerably increased during the last decade (see also School Outcomes: Cognitive Function, Achievements, Social Skills, and Values). In particular, large-scale projects like TIMSS (Third International Mathematics and Science Study) (Beaton et al. 1996) had a strong and enduring impact on educational policy as well as on educational research and have also raised the public interest in scholastic achievement.

1. School Achievement and its Determinants: An Overview

SA can be characterized as cognitive learning outcomes which are products of instruction or aimed at by instruction within a school context. Cognitive outcomes mainly comprise procedural and declarative knowledge but also problem-solving skills and strategies. The following facets, features, and dimensions of SA, most of which are self-explanatory, can be distinguished: (a) episodic vs. cumulative; (b) general/global vs. domain-specific; (c) referring to different parts of a given subject (in a foreign language, e.g., spelling, reading, writing, communicating, or, from the competency perspective, grammatical, lexical, phonological, and orthographical); (d) test-based vs. teacher-rated; (e) performance-related vs. competence-related; (f) actual vs. potential (what one could achieve, given optimal support); (g) different levels of aggregation (school, classroom, individual level); (h) curriculum-based vs. cross-curricular or extracurricular.

Figure 1 shows a theoretical model that represents central determinants of SA. Cognitive and motivational determinants are embedded in a complex system of individual, parental, and school-related determinants and depend on the given social, classroom, and cultural context. According to this model, cognitive and motivational aptitudes have a direct impact on learning and SA, whereas the impact of the other determinants in the model on SA is only indirect.

Figure 1 Interplay of individual cognitive and motivational and other determinants of SA
2. Cognitive Determinants

2.1 Intelligence

Intelligence (see Intelligence, Prior Knowledge, and Learning) is certainly one of the most important determinants of SA. Most definitions of intelligence refer to abstract thinking, ability to learn, and problem solving, and emphasize the ability to adapt to novel situations and tasks. There are different theoretical perspectives, for example Piagetian and neo-Piagetian approaches, information-processing approaches, component models, contextual approaches, and newer integrative conceptions such as Sternberg's triarchic theory of intelligence or Gardner's model of multiple intelligences. Among them, the traditional psychometric approach is the basis for the study of individual differences. In this approach quantitative test scores are analyzed by statistical methods such as factor analysis to identify dimensions or factors underlying test performance (Sternberg and Kaufman 1998). Several factor theories have been proposed, ranging from a general-factor model to various multiple-factor models. One of the most significant contributions has been a hierarchical model based on second-order factors that distinguishes between fluid and crystallized abilities. Fluid intelligence refers to basic knowledge-free information-processing capacity such as detecting relations within figural material, whereas crystallized intelligence reflects influences of acculturation such as verbal knowledge and learned strategies. Fluid and crystallized abilities can both be seen as representing general intelligence, as some versions of a complete hierarchical model would suggest (see Gustafsson and Undheim 1996).

Substantial correlations (ranging from r = 0.50 to r = 0.60) between general intelligence and SA have been reported in many studies. That is, more than 25 percent of the variance in post-test scores is accounted for by intelligence, depending on student age, achievement criteria, or the time interval between measurement of intelligence and SA. General intelligence, especially fluid reasoning ability, can be seen as a measure of flexible adaptation of strategies in novel situations and complex tasks which put heavy demands and processing loads on problem solvers. The close correlation between intelligence and SA has formed the basis for the popular paradigm of underachievement and overachievement: students whose academic achievement is lower than predicted on the basis of their level of intelligence are characterized as underachievers; in the reverse case, as overachievers. This concept, however, has been criticized as it focuses on intelligence as the sole predictor of SA, although SA is obviously determined by many other individual as well as instructional variables.

Intelligence is related to quality of instruction. Low instructional quality forces students to fill in gaps for themselves, detect relations, infer key concepts, and develop their own strategies. On the whole, more intelligent students are able to recognize solution-relevant rules and to solve problems more quickly and more efficiently. Additionally, it is just that ability that helps them to acquire a rich knowledge base which is 'more intelligently' organized and more flexibly utilizable and so has an important impact on subsequent learning processes. Accordingly, intelligence is often more or less equated with learning ability (see Gustafsson and Undheim 1996). Beyond general intelligence, specific cognitive abilities do not seem to have much differential predictive power, although there is occasional evidence for a separate contribution of specific abilities (Gustafsson and Undheim 1996). In Gardner's theory of multiple intelligences there are seven abilities (logical-mathematical, linguistic, spatial, musical, bodily-kinesthetic, interpersonal, and intrapersonal intelligence). Among domain-specific abilities, reading ability, which is related to linguistic intelligence in Gardner's model, has special significance. Specific abilities such as phonological awareness have turned out to be early predictors of reading achievement.
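The variance figures quoted above can be checked arithmetically: the proportion of variance explained is the square of the correlation coefficient (the coefficient of determination). As a worked example (the regression notation is ours, added only for illustration):

\[
r = 0.50 \;\Rightarrow\; r^{2} = 0.25, \qquad r = 0.60 \;\Rightarrow\; r^{2} = 0.36
\]

\[
\hat{A} = a + b\,\mathrm{IQ}, \qquad A < \hat{A} \ \text{(underachiever)}, \qquad A > \hat{A} \ \text{(overachiever)}
\]

Correlations of r = 0.50 to 0.60 thus correspond to roughly 25 to 36 percent of shared variance, consistent with the 'more than 25 percent' figure above; and 'lower than predicted' in the under-/overachievement paradigm amounts to a negative residual from the regression of achievement A on intelligence.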
2.2 Learning Styles and Learning Strategies

Styles refer to characteristic modes of thinking that develop in combination with personality factors (see Cognitive Styles and Learning Styles). While general cognitive or information-processing styles are important constructs within differential psychology, learning styles are considered as determinants of SA. Learning styles are general approaches toward learning; they represent combinations of broad motivational orientations and preferences for strategic processing in learning. Prominent learning style conceptions differ mainly with respect to intrinsic vs. extrinsic goal orientations and surface vs. deep processing. Intrinsic vs. extrinsic motivation deals with whether task fulfillment is a goal of its own or a means for superordinate goals. In deep processing, learners try to reach as complete a comprehension of tasks as possible, to incorporate their representations into their own knowledge structures and give them personal significance. In surface processing, learners confine processing to the point where they can reproduce the material in an examination or achievement situation as well as possible (Snow et al. 1996).

Compared to general learning styles, learning strategies are more specific: they represent goal-oriented endeavors to influence one's own learning behavior. While earlier research mostly dealt with simple study skills that were seen as observable behavioral techniques, cognitive and metacognitive strategies are now included. Learning strategies therefore comprise cognitive strategies (rehearsal, organization, elaboration)
and metacognitive strategies (planning, monitoring, regulation) as well as resource-oriented strategies (creating a favorable learning environment, controlling attention, and sustaining concentration) (Snow and Swanson 1992). Frequently, learning strategies are assessed by means of questionnaires. In spite of the great importance learning strategies have for understanding knowledge acquisition and learning from instruction, few empirical data concerning the relation between learning strategies and SA are available. Most studies done in the context of schools and universities have shown positive but small correlations between learning strategies and SA. One reason may be that assessment by questionnaires is too general. On the whole, the exact conditions under which learning strategies are predictive of achievement still have to be worked out. Besides motivational and metacognitive factors, epistemological beliefs are important determinants of learning strategies and achievement. They refer to beliefs about the nature of knowledge and learning, such as: knowledge acquisition depends on innate abilities, knowledge is simple and unambiguous, and learning is quick. Epistemological beliefs can influence quality of achievement and persistence on difficult tasks (Schommer 1994).
2.3 Prior Knowledge

Although there have been early attempts to identify knowledge prerequisites for learning, the role of prior knowledge (see Intelligence, Prior Knowledge, and Learning) as a determinant of SA has long been neglected. Since the early 1980s, research on expertise has convincingly demonstrated that the superior performance of experts is mainly caused by the greater quantity and quality of their knowledge bases (see Expertise, Acquisition of). Prior knowledge comprises not only domain-specific content knowledge, that is, declarative, procedural, and strategic knowledge, but also metacognitive knowledge, and it refers to explicit as well as tacit knowledge (Dochy 1992). Recent research has shown that task-specific and domain-specific prior knowledge often has a higher predictive power for SA than intelligence: 30 to 60 percent of the variance in post-test scores is accounted for by prior knowledge (Dochy 1992, Weinert and Helmke 1998). Further results concern the joint effects of intelligence and prior knowledge on SA. First, there is a considerable degree of overlap in the predictive value of intelligence and prior knowledge for SA. Second, lack of domain-specific knowledge cannot be compensated for by intelligence (Helmke and Weinert 1999). But learning is hindered not only by a lack of prior knowledge, but also by misconceptions many learners have acquired in their interactions with everyday problems. For example, students often have naive
convictions about physical phenomena that contradict fundamental principles like conservation of motion. These misconceptions, which are deeply rooted in people’s naive views of the world, often are in conflict with new knowledge to be acquired by instruction. To prevent and overcome these conflicts and to initiate processes of conceptual change is an important challenge for classroom instruction.
3. Motivational Determinants

3.1 Self-concept of Ability

The self-concept of ability (largely equivalent to subjective competence, achievement-related self-confidence, expectation of success, self-efficacy; see Self-concepts: Educational Aspects; Self-efficacy: Educational Aspects) represents the expectancy component within the framework of an expectancy × value approach, according to which subjective competence (expectancy aspect) and subjective importance (value aspect) are central components of motivation. Substantial correlations between self-concept of ability and SA have been found. Correlations are higher the more domain-specifically the self-concept of ability is conceptualized, the higher it is, and the older the pupils are. Self-concept of ability is negatively related to test anxiety (see Test Anxiety and Academic Achievement) and influences scholastic performance by means of various mechanisms: students with a high self-concept of ability (a) initiate learning activities more easily and quickly and are less prone to procrastination, (b) are more apt to continue learning and achievement activities in difficult situations (e.g., when a task is unexpectedly difficult), (c) show more persistence, and (d) are better protected against interfering cognitions such as self-doubt and other worries (Helmke and Weinert 1997).
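The expectancy × value logic referred to here can be written compactly. In its classic multiplicative form (a simplified sketch of Atkinson-type models, not a formalization given in this article):

\[
\text{Motivation} \;\propto\; E \times V
\]

where E denotes the expectancy component (subjective competence, expectation of success) and V the value component (subjective importance). The multiplicative link implies that motivation collapses when either component approaches zero, no matter how large the other is.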
3.2 Attitude towards Learning, Motivation to Learn, and Learning Interest

There are several strongly interrelated concepts that are associated with subjective value—of the domain or subject under consideration, of the respective activities, or, on a more generalized level, of teachers and school. The value can refer to the affective aspect of an attitude, to subjective utility, relevance, or salience. In particular, attitude toward learning means the affective (negative or positive) aspect of the orientation towards learning; interest is a central element of self-determined action and a component of intrinsic motivation; and motivation to learn comprises subjective expectations as well as incentive values (besides anticipated consequences like pride, sorrow, shame, or reactions of significant others, the
incentive of the learning action itself, i.e., interest in the activity). Correlations between SA and these constructs are found to be positive but not very strong (the most powerful is interest, with correlations in the range of r = 0.40). These modest relations indicate that the causal path from interest, etc., to SA is long and complex and that various mediation processes and context variables must be taken into account (Helmke and Weinert 1997).
3.3 Volitional Determinants

Motivation is a necessary but often not sufficient condition for the initiation of learning and for SA. To understand why some people—in spite of sufficient motivation—fail to transform their learning intentions into corresponding learning behavior, volitional concepts have proven to be helpful. Recent research on volition has focused on forms of action control, especially the ability to protect learning intentions against competing tendencies, using concepts like 'action vs. state orientation' (Kuhl 1992). The few empirical studies that have correlated volitional factors with SA show a nonuniform picture with predominantly low correlations. This might be due to the fact that these variables are primarily significant for self-regulated learning (see Self-regulated Learning) and less important in the typical school setting, where learning activities and goals are (at least for younger pupils) strongly prestructured and controlled by the teacher (Corno and Snow 1986). Apart from that, no simple, direct, linear correlations between volitional characteristics and SA can be expected, but rather complex interactions and manifold possibilities of mutual compensation, e.g., compensation of inefficient learning strategies by increased effort.
4. Further Perspectives

In Fig. 1 only single determinants and 'main' effects of cognitive and motivational determinants on SA were considered. Actually, complex interactions and context specificity as well as the dynamic interplay between variables have to be taken into account. The following points appear important:

(a) Interactions among various individual determinants. For example, maximum performance necessarily requires high degrees of both intelligence and effort, whereas in the zone of normal/average achievement, lack of intelligence can be compensated (as long as it does not drop below a critical threshold value) by increased effort (and vice versa).

(b) Aptitude × treatment interactions and classroom context. Whether test anxiety exerts a strong negative or a low negative impact on SA depends on aspects of the classroom context and on the characteristics of instruction. For example, highly test-anxious pupils appear to profit from a high degree of structuring and suffer from a too open, unstructured learning atmosphere, whereas the reverse is true for self-confident pupils with a solid base of prior knowledge (Corno and Snow 1986). Research has shown a large variation between classrooms concerning the relation between anxiety and achievement as well as between intelligence and achievement (Helmke and Weinert 1999). A similar process of functional compensation has been demonstrated for self-concept (Weinert and Helmke 1987).

(c) Culture specificity. Whereas the patterns and mechanisms of basic cognitive processes that are crucial for learning and achievement are probably universal, many relations shown in Fig. 1 depend on cultural background. For example, the 'Chinese learner' shows not only (on average) a higher level of effort, but the functional role of cognitive processes such as rehearsal is also different from the equivalent processes of Western students (Watkins and Biggs 1996).

(d) Dynamic interplay. There is a dynamic interplay between SA and its individual motivational determinants: SA is affected by motivation, e.g., self-concept, and affects motivation itself. From this perspective, SA and its determinants change their states as independent and dependent variables. The degree to which the mutual impact (reciprocity) of academic self-concept and SA is balanced has been the issue of controversy (skill development vs. self-enhancement approach).

See also: Academic Achievement: Cultural and Social Influences; Academic Achievement Motivation, Development of; Cognitive Development: Learning and Instruction; Intelligence, Prior Knowledge, and Learning; Learning to Learn; Motivation, Learning, and Instruction; Test Anxiety and Academic Achievement
Bibliography

Beaton A E, Mullis I V S, Martin M O, Gonzales D L, Smith T A 1996 Mathematics Achievement in the Middle School Years. TIMSS International Study Center, Boston
Corno L, Snow R E 1986 Adapting teaching to individual differences among learners. In: Wittrock M C (ed.) Handbook of Research on Teaching, 3rd edn. Macmillan, New York, pp. 255–96
Dochy F J R C 1992 Assessment of Prior Knowledge as a Determinant for Future Learning: The Use of Prior Knowledge State Tests and Knowledge Profiles. Kingsley, London
Gustafsson J-E, Undheim J O 1996 Individual differences in cognitive functions. In: Berliner D C, Calfee R C (eds.) Handbook of Educational Psychology. Simon & Schuster Macmillan, New York, pp. 186–242
Helmke A, Weinert F E 1997 Bedingungsfaktoren schulischer Leistungen. In: Weinert F E (ed.) Enzyklopädie der Psychologie. Pädagogische Psychologie. Psychologie des Unter-
richts und der Schule. Hogrefe, Göttingen, Germany, pp. 71–176
Helmke A, Weinert F E 1999 Schooling and the development of achievement differences. In: Weinert F E, Schneider W (eds.) Individual Development from 3 to 12. Cambridge University Press, Cambridge, UK, pp. 176–92
Kuhl J 1992 A theory of self-regulation: Action versus state orientation, self-discrimination, and some applications. Applied Psychology: An International Review 41: 97–129
Schommer M 1994 An emerging conceptualization of epistemological beliefs and their role in learning. In: Garner R, Alexander P A (eds.) Beliefs about Text and Instruction with Text. Erlbaum, Hillsdale, NJ, pp. 25–40
Snow R E, Corno L, Jackson D 1996 Individual differences in affective and conative functions. In: Berliner D C, Calfee R C (eds.) Handbook of Educational Psychology. Simon & Schuster Macmillan, New York, pp. 243–310
Snow R E, Swanson J 1992 Instructional psychology: Aptitude, adaptation, and assessment. Annual Review of Psychology 43: 583–626
Sternberg R J, Kaufman J C 1998 Human abilities. Annual Review of Psychology 49: 479–502
Watkins D A, Biggs J B 1996 The Chinese Learner: Cultural, Psychological, and Contextual Influences. CERC and ACER, Hong Kong
Weinert F E, Helmke A 1987 Compensatory effects of student self-concept and instructional quality on academic achievement. In: Halisch F, Kuhl J (eds.) Motivation, Intention, and Volition. Springer, Berlin, pp. 233–47
Weinert F E, Helmke A 1998 The neglected role of individual differences in theoretical models of cognitive development. Learning and Instruction 8(4): 309–23
A. Helmke and F.-W. Schrader
School Administration as a Field of Inquiry

Inquiry on the administration of education reflects a variety of intellectual and social influences. Those influences and the main trends in the field's scholarship are examined.
1. The Ecology of an Applied Field of Inquiry

The ecology of an applied field of inquiry reflects its indigenous theory and research and that of related fields, demands from the worlds of policy, practice, and professional preparation, and the spirit of the times, which usually mirrors societal and global trends. School administration (often educational administration) is a field of many specializations in such areas as politics, organizational studies, fiscal affairs, law, and philosophical issues or, in less discipline-oriented areas, such as school effectiveness, leadership and supervision, human resource management and labor relations, and equity issues.
Explanations about subject matter are developed in areas of this sort. As in other fields, theoretical plausibility is judged using various logical and evidentiary criteria. However, applied fields also attend to relevance for practice and implications for assorted conceptions of organizational improvement.
2. A Brief History

Educational administration became a recognizable academic field in the early twentieth century in North America and later, in other parts of the world (Campbell et al. 1987, Culbertson 1988). Initially, the field was oriented to practical matters and broad issues reflecting pedagogical and societal values. An example was democratic administration, especially popular before and after World War II. In the 1950s, the field began to adopt social science theories and methods. In the mid-1970s, the subjectivistic, neo-Marxist (largely as critical theory), and identity politics perspectives, already popular in the social sciences and humanities, reached the field, with postmodernism a later addition. These perspectives typically were critical of science, which generated extensive debate. The social science emphasis led to specialization along disciplinary lines that resulted in considerable research, much of it reported in the Encyclopedias of Educational Research issued in roughly 10-year cycles, with 1992 the most recent, in two International Encyclopedias of Education issued in 1975 and 1994, and in the first Handbook of Research on Educational Administration (Boyan 1988). The second Handbook (Murphy and Louis 1999) was more fragmented and factious, often mixing research with philosophical commentary.
3. Current Trends

Current trends in inquiry into school administration include conflicting views of knowledge and ethics, efforts to bridge the gaps between theory on the one hand and policy and practice on the other, more attention to making sense of complexity, and an emerging literature on comparative international aspects of school administration.
3.1 Conflicting Views

The controversies about knowledge and appropriate methods of seeking it, and about ethics, are ultimately philosophical. The contending positions are similar across the social sciences and humanities, although each field's literature reflects its peculiarities. The main strands of contending thought are grounded in different general philosophies. Subjectivism, a version of idealism, stresses mind, reason,
and intuition. Science is criticized for devaluing humanistic concerns in favor of objectivity. In ethics, there is usually a hierarchy of values with the highest being absolute. In educational administration, Greenfield was this movement's leading early figure, while Hodgkinson remains its best known scholar, with special contributions to ethical theory (Willower and Forsyth 1999; see also relevant entries in the 1992 Encyclopedia of Educational Research and the 1994 International Encyclopedia of Education).

Critical theory, with its historical roots in the Frankfurt School's search for a more contemporary Marxism, is devoted to a reformist agenda (revolution is out of fashion) and an ethic of emancipation. Its ideology now goes beyond class to stress race and gender, although advocates of identity politics may not endorse critical theory. Suspicious of science, described as serving the purposes of the ruling class, critical theory's literature attends mainly to unjust social arrangements. There are many adherents across subfields of education, with William Foster noteworthy in educational administration. Generally, they contend that schools serve the powerful, giving short shrift to the disadvantaged. The antidote is political action, including radicalized educators.

Postmodernists and poststructuralists reject metanarratives or broad theories of every kind, including scientific and ethical ones, believing such theories totalize and ignore difference. This leads to an emphasis on text or words; all that is left after efforts to understand and improve the world are disallowed. Texts are deconstructed, or examined for their assumptions, including the ways they oppress and 'trivialize' otherness, by what is said and omitted. This view has only recently received much attention in educational administration: the August 1998 issue of Educational Administration Quarterly was on postmodernism and the September 1998 Journal of School Leadership featured a debate on that perspective.

In applied fields such as administration, postmodernism may be selectively combined with critical theory. This was discussed by Alvesson and Deetz (1996) and illustrated in the June 1998 issue of Administrative Science Quarterly on critical perspectives, where the writings of Foucault on power and control were often cited. It remains to add that Derrida, perhaps the leading scholar in the postmodern-poststructuralist camp, recently (1994) appeared to argue for a relaxing of strictures against metanarratives to allow for radicalized critique, which is what deconstruction has been all along. This suggests recognition that postmodern relativism and nihilism have made it irrelevant to the world of human activity and practice.

Adherents of the views sketched often associate science with positivism, a long dead perspective that flourished in the Vienna Circle (1924–1936). Using stringent standards of verification, positivists saw metaphysics as pointless, and values as mere pref-
Although it held on longer in fields such as psychology, positivism lost favor and was folded into the more trenchant analytical philosophy. Nevertheless, positivism has been disinterred to serve as a target in philosophical disputes. Naturalistic and pragmatist philosophies have been the main sources of support for proponents of scientific inquiry in educational administration. Science is seen as a human activity that seeks explanations of how things work that can be subjected to public assessment. It is an open, growing enterprise that is fallible but self-corrective, as better methods and theories displace older ones. Based on logic and evidence, its results have been highly successful from the standpoints of the development of plausible theories and of benefiting humankind. Claims about uncovering an ultimate reality or final truths are not made; they are inconsistent with a self-rectifying conception of inquiry. In ethics, absolute principles and pregiven ideologies are rejected in favor of an inquiry-based process that examines moral choices in concrete situations. Competing alternatives are appraised in terms of likely consequences using relevant concepts and theories, and by clarifying applicable principles. Such principles are derived from cumulative moral experience and are guides, not absolutes (see Willower and Forsyth 1999). These contrasting philosophical approaches were sketched to suggest the substance of contemporary debate, because of their implications for scholarship in school administration. Qualitative research, for instance, is a mainstay of subjectivists and of those critical theorists who do 'critical ethnography.' Postmodernists are less explicit about research, but several articles in the Educational Administration Quarterly issue on postmodernism reported field research. Qualitative studies were done in educational administration long before these views gained popularity, but they have become more widespread, and the literature now includes work aimed at demonstrating injustices rather than building theory. Changing philosophical emphases can legitimate new subject matter and methods, as shown by comparing the second Handbook of Research on Educational Administration (Murphy and Louis 1999) with the first (Boyan 1988). However, philosophical influences are filtered by variations of interest and practice. Work may borrow selectively, sometimes from conflicting philosophies, or tailor philosophical ideas to special purposes, or ignore them entirely. For instance, in ethics, single concepts such as equity, caring, or community often are stressed, sometimes along with visions of the ideal school. More philosophical explorations of moral principles and their applications are scarcer. In epistemology, many writers in antiscience camps argue against positivistic 'hegemony,' while ignoring formidable views of inquiry. Hence, many disputes are about politics rather than theories of knowledge. In educational administration, examples of broad
philosophical efforts are Evers and Lakomski's joint work on epistemology and Hodgkinson's and Willower's respective approaches to ethics and to inquiry (see Boyan 1988 and Willower and Forsyth 1999). The substitution of politics for philosophical substance can be found in most social sciences and humanities, reflecting the influences of critical theory, postmodernism, and identity politics. Various commentators see these influences as declining because of their one-sidedness and practical irrelevance. However, they remain visible, claiming a share of the literatures of assorted specializations, including educational administration.

3.2 Theory and Practice

In school administration, current issues command much attention; theoretical discussions have a more specialized appeal. Devolution, school site-based management, and restructured decision processes are examples of a type of contemporary reform that is the object of wide attention. Such reform results from recent trends in government and politics, but it is also the current incarnation of a long line of participatory-type schemes, seen especially in Western countries. They range across democratic administration, human relations, organizational development, open climates and participatory styles of leadership, organizational health, and staff empowerment. This sort of reform is oriented to practice, but it has theoretical grounding because openness and participation are often held to be paths to organizational improvements, such as increased staff motivation and commitment. However, most school improvement efforts make demands on time, erode teacher autonomy, and disturb orderly routines, adding to staff overload. Counterforces include pressures to display legitimating progress and educator hopes for positive student outcomes. Such contradictory pressures are characteristic of attempts to improve school practices, but how they play out depends on contexts and contingencies. Another contemporary reform is privatization. It stresses consumer choice, efficiency, and economy. Often linked to efforts to change governmental policies on education, it is squarely in the political arena. While reforms of policy and organization aim at changing practice, students of cognition and learning have sought to understand how practitioners solve problems and make decisions. Influenced by Simon and the 'Carnegie School,' this work emphasizes domain-general problem-solving processes and domain-specific knowledge. Research comparing expert and novice decision makers shows the importance of both. In educational administration, the investigations of Leithwood and his associates (Leithwood and Steinbach 1995) have been noteworthy (for other sources, see Willower and Forsyth 1999). Professional preparation programs for school administrators use many approaches to transfer learning
to practice. They include case studies, simulations (sometimes computer-based), films, approximations of virtual reality, internships, and other school site-based activities. Program improvement has benefited from the cooperative efforts of professorial and administrator associations (Willower and Forsyth 1999). Improving the practice of school administrators and advancing educational reform are, like inquiry, ever unfinished. Current successes do not guarantee future ones, despite progress made in developing explanatory theories and continuing efforts to improve schools and the preparation of administrators. A barrier to the implementation of theory-based reforms in practice is the multiplicity and complexity of the influences that impinge on the schools, considered next.

3.3 Making Sense of Complexity

School administration appears to be more complex than many forms of management. Schools tend to be vulnerable to community and societal forces, are usually subject to substantial regulation (no surprise, since they serve society's children), and have difficulty demonstrating effectiveness. The latter occurs because of the complexity of showing the school's part in student learning vs. that of the family, variations in student motivation and ability, and other factors. Further, schools are commonly expected to foster subject matter achievement as well as acceptable social skills and conduct. Beyond that, school administrators oversee personnel who ordinarily define themselves as professionals and value latitude in their work. Thus, school organizations are characterized by external pressures and internal forces that heighten complexity. In addition, social changes have placed new demands on schools as societal arrangements that provided regulatory settings for children and youth have begun to erode. As this occurred, schools increasingly have had to cope with an array of social ills, for instance, substance abuse, vandalism, and violence, against a backdrop that too often includes dysfunctional families and subcultures with norms that devalue learning, along with the miseducative effects on young people of ubiquitous mass media. As a result, many schools face expanded responsibilities, adding new complexities to the administration of schools. Such complexity can be daunting. One response in education, as in business, has been a susceptibility to fads, with total quality management as a recent example. These fads may include some reasonable ideas, commonly packaged as part of a larger program. Although often treated as panaceas, they are typically short-lived. Such fads can be explained as attempts to give perceived legitimacy to organizational efforts to confront what in reality are intractable problems. While practitioners adapted to changing social conditions and increasing complexity, those who study school administration sought to gain a better understanding of such phenomena. More sophisticated
quantitative analyses, facilitated by computer technology, provided ways of examining relationships among many variables in numerous combinations, clusters, and sequences. Although empirical investigations rarely employed such analyses, their availability provides possible handles on complexity. More popular recently have been field studies, done as participant observation, ethnography, or case studies. While qualitative methods are favored by those wishing to advance a particular view using illustrations from selected 'narratives,' more traditional field work has long sought to plumb complexity by attending to the details of daily activity. Disputes about qualitative vs. quantitative research are not new, mirroring the richness vs. rigor debates in sociology in the 1930s (see Waller 1934). In educational administration, the dominant view appears to favor the use of a variety of methods, recognizing that each has strengths and weaknesses. In any event, empirical studies in quantitative and qualitative modes have sought to make complexity more understandable, albeit in different ways. The increased recognition of complexity has rekindled interest in chaos theory in educational administration and the social sciences. Chaos theory, it is well to recall, is not a denial of the possibility of orderliness. Rather, it is a search for odd, intricate, or many-faceted patterns. Reviewing chaos scholarship, Griffiths et al. (1991) concluded that enthusiasm for the theory had not resulted in meaningful social research. They suggested that chaos theory has limited applications in educational administration because of the theory's dependency on precise measures. Conceptual efforts to accommodate fluidity and fluctuation are not new. To cite others, they range from Hegelian dialectical analysis, with its clash of opposites followed by a new synthesis, to threshold effects noted only when a variable reaches a certain level, with little or no gradation, to equifinality or functional equivalents, where similar results stem from different sources. Ethical theory has also been concerned with complexity. Perspectives that recognize how moral choices are embedded in elaborate contexts, and attempt to include potential consequences of action as part of choice, are illustrative. Related are cognitive-type studies of problem solving and decision making, and work on types of intelligence and their relationship to problem solving. Clearly, human experience is fluid and complex, and human behavior is often irrational and contradictory. Yet, in inquiry, the irrational is apprehended through rational processes, and experience is known through the use of concepts and explanations. Obviously, concepts and theories simplify, but so do intuition, lore, pregiven ideologies, aphorisms, and other ways of confronting experience. Nothing comprehends everything. The trick, in science and administration, is to include the relevant elements and get to the heart of the problem. The record on understanding complexity
in school administration, as in the social sciences, is one of incremental gain. The methods of scientific inquiry can make complexity more comprehensible and more manageable, but there are no panaceas.
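The point made by Griffiths et al. about precise measures can be made concrete with a small numerical sketch. The Python fragment below is an illustration added here, not an analysis from the chaos literature on schools; the logistic map and all values in it are generic textbook material. It shows how two trajectories whose starting points differ by far less than any plausible measurement error in social data quickly diverge, which is why chaotic models demand a precision that school measures cannot deliver.

```python
# Minimal illustration of sensitive dependence on initial conditions:
# two logistic-map trajectories that start almost identically soon differ
# by as much as the values themselves.

def logistic_map(x0, r=3.9, steps=50):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_map(0.600000)
b = logistic_map(0.600001)  # differs only in the sixth decimal place

for t in (0, 10, 20, 30, 40, 50):
    print(f"t={t:2d}  a={a[t]:.4f}  b={b[t]:.4f}  gap={abs(a[t] - b[t]):.4f}")
# By roughly t = 30 the gap is of the same order as the values themselves,
# so measures as coarse as test scores or survey scales cannot determine
# which trajectory a school is actually on.
```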
3.4 Internationalization

Educational administration is becoming internationalized. New preparation programs and journals are cropping up around the world. Established outlets such as the Journal of Educational Administration and Educational Administration Quarterly regularly feature comparative-international pieces, and research done cooperatively by investigators in different countries is increasingly appearing, as, for example, in the July 1998 issue of the Journal of School Leadership. In addition to the International Encyclopedias of Education, World Yearbooks of Education, and others, new works are being published, for example the 1996 International Handbook of Educational Leadership and Administration. Scholarly associations such as the University Council for Educational Administration (in North America) and the Commonwealth Council for Educational Administration and Management continue to develop cooperative activities. Internationalization presents opportunities for inquiry that take advantage of different societal-cultural settings to examine the impact of contingencies on behavior and outcomes. Such research can show how the relationships of the variables under study are mediated by setting variables. This, in turn, can lead to new theoretical explanations, not to mention better international understanding.

See also: Administration in Organizations; Educational Research and School Reform; School (Alternative Models): Ideas and Institutions; School as a Social System; School Effectiveness Research; School Management
Bibliography

Alvesson M, Deetz S 1996 Critical theory and postmodernism approaches to organizational studies. In: Clegg S R, Hardy C, Nord W R (eds.) Handbook of Organization Studies. Sage, London
Boyan N J (ed.) 1988 Handbook of Research on Educational Administration. Longman, New York
Campbell R F, Fleming T, Newell T J, Bennion J W 1987 A History of Thought and Practice in Educational Administration. Teachers College Press, New York
Culbertson J A 1988 A century's quest for a knowledge base. In: Boyan N J (ed.) Handbook of Research on Educational Administration. Longman, New York
Derrida J 1994 Specters of Marx: The State of the Debt, the Work of Mourning, and the New International. Routledge, New York
Griffiths D E, Hart A W, Blair B G 1991 Still another approach to administration: Chaos theory. Educational Administration Quarterly 27: 430–51
Leithwood K, Steinbach R 1995 Expert Problem Solving. State University of New York Press, Albany, NY
Murphy J, Louis K S (eds.) 1999 Handbook of Research on Educational Administration, 2nd edn. Jossey-Bass, San Francisco
Waller W 1934 Insight and scientific method. American Journal of Sociology 40: 285–97
Willower D J, Forsyth P B 1999 A brief history of scholarship on educational administration. In: Murphy J, Louis K S (eds.) Handbook of Research on Educational Administration, 2nd edn. Jossey-Bass, San Francisco, pp. 1–24
D. J. Willower
School (Alternative Models): Ideas and Institutions

Ever since its beginnings at the turn of the eighteenth to the nineteenth century, the modern educational system has been accompanied by alternative schools, the numbers and broad effects of which have varied over the course of the years. This coexistence is by no means based on independence, but is the manifestation of complex reciprocal relations which exist despite fundamental differences in the ideologies of the two approaches. Phases of intensive development in the domain of traditional schools have been mirrored by similar phases of intense activity in the alternative school movement; alternative schools fulfill important functions within the general spectrum of educational possibilities.
1. Background and Conception

Because proponents of alternative schooling direct their criticism at contemporary manifestations of the traditional school system, the focus of their critique has shifted over time. In general, however, their criticism is based on objections to the structures and functions of the modern school. They do not restrict their demands to a return to earlier historical epochs and structures in which schools still played a minor role. Their objectives are for duress (compulsory education and instruction) to be replaced by freedom; teacher-directed schooling and instruction by self-directed action on the part of the student; the curriculum by the individual needs of the student; competition and rivalry by community; compartmentalization by wholeness and integral methods in instruction; and the separation of life and school by an interweaving of the two spheres (e.g., Holzman 1997). The fundamental criticism of traditional schooling is based on the conviction that this form of education is burdened by a basic
contradiction: although adults, and particularly adults living in modern democracies, are granted dignity, majority, and self-determination, children and adolescents at school are denied these basic rights, albeit on the pretext that they first have to learn how to exercise them. Because it is not possible to recreate the historical conditions which preceded institutionalized education and its inherent ambivalences, however, the critics of traditional schooling have to resort to so-called 'counterschool' initiatives (German: Gegenschulen). Although some of these are embedded in a variety of alternative organizations (e.g., flat-sharing communes, communal child care, and even dietetic measures), in general it is not society that is deschooled, but school itself. The extent to which this radical rearrangement leads to the complete disintegration of institutions is dependent on its (partly unintended) consequences.
2. Structural and Organizational Characteristics

Pedagogical aims are reflected in the specific organizational characteristics of any given school (e.g., the general structure of the school, the relationships between teachers, parents, and students, the organization of classes, the methods implemented). In turn, these characteristics constitute the decisive prerequisites that allow many alternative institutions to exceed the norm with regard to content, in both the pedagogical and the social fields. Indeed, the typical approach of the alternative schools can be reduced to a simple formula: in contrast to the universal institutions of the traditional system, which are characterized by high levels of internal differentiation and complicated organizational structures, alternative institutions are distinguished by social clarity and immediacy, owing to the fact that they are, from the outset, geared to the particularities of their respective clients. Accordingly, the student body in an alternative school on average numbers fewer than 200 (cf. Mintz et al. 1994). Thus, irrespective of content-related differences, the alternative school represents a special type of school or, more specifically, a type of school offering a special kind of education. While this offer may formally be an open one, it in fact only reaches the few who know how to take it up. Due to these invisible processes of selection, alternative schools can be expected to be more or less immune to many of the problems arising in schools which really are open to all (as is made clear by the reports and evaluations of alternative schools; cf. as a very vivid example Duke and Perry 1978). In accordance with the very definition of the alternative school, however, these assets cannot readily be transferred to the mass school system as a whole. Instead of having to simultaneously meet the needs of students from different backgrounds and with various levels of qualification, alternative schools are able to respond
to individual differences and developments emanating from a more homogeneous context of attitudes and lifestyles. The additional potential for motivation and identification (in both students and teachers) which is inherent in a free choice of school, or in the decision to embark upon a rather unusual experiment, constitutes another factor which cannot easily be transferred to the traditional school. As a rule, 'counterschools' are spared the arduous coordination of pedagogical subfunctions on the basis of a division of labor: because of the small size of these institutions, instruction and education, counseling and care all rest in the hands of a few. Instead of having to adhere strictly to general and at least partly formulated rules in order to ensure the smooth running of the school and its classes, confidence in the closer personal ties and connections of the school community makes it possible to act more effortlessly, easily, and flexibly. Finally, the danger of simply running out of inspiration within the confines of internal school processes is reduced, as the high levels of input from the local environment mean that the system's contours remain rather diffuse. The close links between the home and the school, which are not limited to financial support but also involve the parents' practical cooperation, can be seen as another important asset.
3. Developments Since the 1960s

The concepts behind alternative schooling and its implementation reach far back in time, and in some respects they refer to the founders of modern educational theory (e.g., Rousseau or Herbart). Progressive education in the United States and so-called Reformpädagogik in Germany, both of which emerged after the end of the nineteenth century, may certainly be considered to be phases of activity to which a number of existing institutions can be traced back (e.g., Semel and Sadovnik 1999). However, the main wave of alternative school foundation resulted from political and social conditions, including the failure or petering out of large-scale attempts at educational reform (e.g., Carnie et al. 1996); this holds especially for the Anglo-American countries since the 1960s. Most of the alternative institutions founded during this phase were set up in the United States, and the majority are still to be found there (cf. Mintz et al. 1994). Although less than 5 percent of students at elementary and secondary school actually attend alternative schools, the institutions now number several thousand in the United States, most of them elementary schools. The institutions are highly divergent in terms of their targets, methods, integration of parents and students, links with the traditional system, locus of control, and circles of addressees. It would thus seem justified to focus this overview of alternative schooling since the 1960s on the situation in the United States, all the more so because both the
fundamental principle of alternative schools ('as many institutions as there are needs and groups') and current trends in development (adopting the tasks of the traditional system) are to be observed there. One important form of alternative education is home schooling (cf. Evans 1973), where parents instruct their children at home with the permission of the state. If, as is quite often the case, these children meet up with other home students in (elementary) schools on single days of the week, there is a gradual transition to school cooperatives aiming at unconstrained, playful, spontaneous learning in open instruction. Other institutions specifically target problematic cases, so-called at-risk students, including not only chronic truants and low-performing students but also talented nonconformers (in the broadest sense). A well-balanced program of drill and freedom is intended to render school attractive again, thus helping these students to enhance their performance. However, this development has also led to a reverse trend, a sort of back-to-basics shift to formal 'three Rs' schools, which specifically emphasize basic skills, discipline, and respect for the authority of parents and teachers (cf., e.g., Deal and Nolan 1978). Similar trends are to be observed at the secondary level. The large number of schools-within-schools (smaller, less complex units running different experimental programs, affiliated to larger state schools) is particularly notable. The spectrum also includes schools attempting to set themselves up as 'just community schools' based on the principles of Kohlberg's developmental theory (cf. Kohlberg 1985). The so-called 'schools without walls' represent another form of alternative education. They not only offer a 'normal' curriculum (differing from that of the traditional school in terms of content and time), but also give students the opportunity to learn in the adult world. This is intended to provide students with extrascholastic experience and to enable them to prove themselves in the world of work. Although this approach initially seemed very promising, experience has shown that it is in fact very hard to put into practice.
4. Recent Trends and Perspectives

Over the past few decades, there have been signs of a kind of division of function and labor between the traditional and the alternative schools in the United States. This has been expressed in an increase in the state-controlled administration of alternative institutions. The fact that such large numbers of US alternative schools were founded at the beginning of the 1980s is probably related to this development. The radical free schools (e.g., Dennison 1969), which signaled the upheavals of the 1960s and certainly provided considerable impetus for subsequent developments, play a lesser role in today's broader stream
of initiatives. The basic conceptions and expectations of alternative schooling have also been affected, resulting in the extremely short lifespan of some of these institutions. There is still a great deal of fluctuation, making it hard to arrive at valid evaluations of these institutions (cf. the data in Mintz et al. 1994, p. 10ff.). Even the early approaches and experiments demonstrated the great weaknesses and dangers of such institutions: the cost of their structure-related advantages (cf. also Swidler 1979). Critics have reduced this to the polemic formula that precisely those principles and premises which allowed the pure 'counterschool' initiatives of the early period to emerge later threatened their ability to survive. Wherever the concept of freedom as a counterpole to the compulsory character of the traditional institutions went hand in hand with systematic shortcomings in the specification of learning targets, there was the risk that later conflicts would be all the more severe, owing to the differing hopes and expectations of those involved (cf. Barth 1972, especially p. 108ff.). When every school-related action was dependent on the consensus of every single member of the school community, the school's fate was put in the hands of changing minorities with the power of veto, all the more so when the small size of these private institutions kindled fears of sudden financial ruin. Allowing almost all of those concerned to take an active role in the decision-making process did not in fact ensure the equality of all participants; instead of increasing the capacity to solve problems, there was a growing risk of disappointment, and the necessary basic consensus was threatened by frequent conflicts and failures in minor matters. The firm and wide-ranging commitment of parents, extending well into issues of classroom practice, often resulted not so much in the enhancement as in the disturbance of school activities. Teachers were consequently made to feel insecure and were put on the defensive instead of being provided with support and control. Attempts to simultaneously cope with the factual demands, organizational affairs, and personal standards inherent in these institutions (preferably without referring to preset guidelines) led to the systematic overexertion of those concerned, and the end of many 'counterschools' was marked by the burnout of the teaching staff (Dennison described this phenomenon as early as 1969). The unconditional openness with respect to the other members of the school community and the intimacy of the small group responsible for the school meant that factual disputes often escalated into personal trials. Moreover, school affairs were often overshadowed by secondary motives such as parental self-realization or the search for security. By integrating and promoting alternative schools, the state aims to increase the diversification of the educational spectrum, to enhance the overall efficiency of schools (which has been subject to criticism for decades; cf., e.g., Goodlad 1984), and to better meet
the specific needs of certain problem groups in both school and society. The US programs of community schools, magnet schools and, more recently, charter schools provide evidence for this trend, and are looked upon with growing interest by other countries as possible models for the resolution of system-related problems. However, more detailed analyses of these models point to significant limitations in their problem-solving capacity, limitations which are ultimately rooted in the basic concepts of market and choice (cf. Lauder et al. 1999). The community schools program has already been thwarted by the US authorities' desegregation measures: although these schools were directed at the advancement of members of the underprivileged minorities, most of whom are colored, they undermined the desired objective of racial integration. Magnet schools were supposed to offer a free choice of school with a special educational program which acted like a magnet, attracting students from beyond the racial boundaries marked out in residential areas and school districts. They were intended to promote racial integration, which can scarcely be achieved through the compulsory measure of busing. A number of objections have been voiced against this school form. For example, it is argued that the enlargement of catchment areas weakens the link between the school and the parental home. Furthermore, although magnet schools may well establish a balanced relation between whites and blacks in their own student bodies, there are too few magnet schools to ensure social compensation on a large scale. Moreover, there is an imperceptible shift in their circle of addressees: instead of reaching the underprivileged, out-of-school colored population with their specific educational provision, magnet schools in fact seem to appeal to white parents and their children who would otherwise switch to private schools or the more privileged suburbs (cf. Smrekar and Goldring 1999). The charter schools, on the other hand, which in a way transfer the principles of the private school (i.e., the autonomy of schools and a free choice of school) to the state school system, are faced with comparable problems on a different level, inasmuch as unforeseen side effects have been revealed in the social process. Despite the fact that the numbers of charter schools have increased considerably over the past few years, they are still far from ever being able to constitute the majority of state schools, belonging by definition to the optional domain of educational provision. The half-hearted legislative reservations perceptible in most of the federal states against according too many privileges to these schools threaten to render the entire charter program futile (cf. Hassel 1999).
Bibliography

Barth R S 1972 Open Education and the American School. Agathon Press, New York
Carnie F, Tasker M, Large M (eds.) 1996 Freeing Education: Steps Towards Real Choice and Diversity in Schools. Hawthorn Press, Stroud, UK
Deal T E, Nolan R R (eds.) 1978 Alternative Schools: Ideologies, Realities, Guidelines. Nelson-Hall, Chicago, IL
Dennison G 1969 The Lives of Children: The Story of the First Street School. Random House, New York
Duke D L, Perry C 1978 Can alternative schools succeed where Benjamin Spock, Spiro Agnew, and B. F. Skinner have failed? Adolescence 13: 375–92
Evans T 1973 The School in the Home. Harper & Row, New York
Goodlad J I 1984 A Place Called School: Prospects for the Future. McGraw-Hill, New York
Hassel B C 1999 The Charter School Challenge: Avoiding the Pitfalls, Fulfilling the Promise. Brookings Institution Press, Washington, DC
Holzman L 1997 Schools for Growth: Radical Alternatives to Current Educational Models. Lawrence Erlbaum Associates, Mahwah, NJ
Kohlberg L 1985 The just community approach to moral education in theory and practice. In: Berkowitz M W, Oser F (eds.) Moral Education: Theory and Application. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 27–87
Lauder H, Hughes D, Watson S, Waslander S, Thrupp M, Strathdee R, Simiyu I, Dupuis A, McGlinn J, Hamlin J 1999 Trading in Futures: Why Markets in Education Don't Work. Open University Press, Buckingham, UK
Mintz J, Solomon R, Solomon S, Muscat A (eds.) 1994 The Handbook of Alternative Education. A Solomon Press Book. Macmillan, New York
Ramseger J 1975 Gegenschulen. Radikale Reformschulen in der Praxis [Counterschools: Radical Reform Schools in Practice]. Klinkhardt, Bad Heilbrunn, Germany
Semel S F, Sadovnik A R (eds.) 1999 'Schools of Tomorrow,' Schools of Today: What Happened to Progressive Education. History of Schools and Schooling, Vol. 8. Peter Lang, New York
Smrekar C, Goldring E 1999 School Choice in Urban America: Magnet Schools and the Pursuit of Equity. Teachers College Press, New York
Swidler A 1979 Organization Without Authority: Dilemmas of Social Control in Free Schools. Harvard University Press, Cambridge, MA
A. Leschinsky
School as a Social System

A 'social system' is a set of related elements that work together to attain a goal. Social scientists have frequently employed the concept of a social system to study the school. This has been done in three ways. First, the class has been portrayed as a social system; its elements include the teacher, the students, and formal and informal groups within the class. Second, the school itself has been viewed as a social system; its components are the administration, faculty, counselors, students, academic departments, the curriculum, the extracurriculum, and social networks and other subunits within the school. Third, the school has been
seen as one of a number of subunits in the larger social system of society (Gordon 1957, Loomis and Dyer 1976, Parsons 1961, Smelser 1988).
1. Viewing the School Class as a Social System

The most familiar application of the social system model to schools is found in Parsons' (1959) classic essay on the school class as a social system. Parsons focuses on the class, rather than the larger unit of the school, because he sees the class as the primary place where students learn and mature. According to Parsons, the two most important functions of the class are 'socialization' and 'allocation.' These functions define and motivate the academic and social processes that occur in the classroom and the interactions that occur among its various components.
1.1 Socialization

Socialization is the process through which students internalize the kinds of commitments that they need to play a useful role in adult society. Although the family and community also socialize students, the length of time students spend in school makes the school class a major influence in the socialization process. Time spent in school during the students' formative years provides ongoing opportunities for systematic, intensive socialization. During this time, students are expected to accept societal values and use them to guide their behavior. Parsons claims that several structural characteristics of the class facilitate the socialization of students. First, students assigned to the same class are fairly homogeneous with respect to age and social development. They are also somewhat homogeneous with respect to social class, since class composition is constrained by the socioeconomic characteristics of the neighborhood in which the school is located. The developmental and socioeconomic homogeneity of the students assists the teacher in transmitting values to the students, who become, in turn, models of appropriate behavior for each other. Second, students in the same class are under the tutelage of one or a small number of adult teachers. The age difference between the teacher and students supports the teacher's authority. The singular role of the teacher as the representative of adult society lends legitimacy to the teacher's values. A third structural feature of the school class is the curriculum. Pupils are exposed to the same curriculum and assigned the same tasks. A shared curriculum and similar activities allow the teacher to reinforce cultural values and attitudes in multiple ways over the school year. A fourth structural feature of the class is its reward structure. Teachers generally establish a set of rewards
and punishments governing student behavior in a classroom. Good behavior, defined by the teacher's values, is rewarded, while disruptive actions are sanctioned. Students learn the rules of conduct that apply to social and work situations, and are motivated to obey these rules by the rewards or punishments their behaviors incur.
1.2 Allocation

Allocation is the process of sorting individuals and assigning them to groups based on their ability or skills. In a school, allocation refers to the assigning of students to instructional units according to their abilities or achievement. The purpose of this sorting process is to prepare students to attain an occupational position in society commensurate with their capabilities (Gamoran et al. 2000, Hallinan 1994, Lynch 2000, Oakes 1994). Parsons defines achievement as excellence, relative to one's peers, in meeting the expectations of the teacher. Achievement has two components: cognitive and moral. Teachers attempt to motivate students to achieve academically and to learn the skills needed to perform in adult society. They also teach students a moral code and a set of behaviors consistent with the values of adult society. Students are evaluated on each of these two components. Society may assign greater weight to one component than the other or may treat them as equally important. As with socialization, structural differentiation occurs through a process of rewarding students for excellence and punishing them for failure to meet teacher expectations. In the elementary school classroom, the criteria for excellence are not clearly differentiated across cognitive and moral dimensions. Teachers aim to develop both good citizenship and an achievement motivation in children. They also begin the process of classifying youth on the basis of their ability to achieve. While assigning students to classes by ability typically does not occur in elementary school, many elementary school teachers instruct their students in small, ability-based groups within the classroom. Their purpose is to facilitate instruction and to prepare students for later channeling into ability-based classes. This initial sorting sensitizes students to the differential achievement that characterizes a class and to the way rewards are allocated for performance. At the secondary level, the cognitive and moral components of achievement are separable, and greater emphasis is placed on cognitive achievement. Parsons claims that students who achieve academic excellence in high school are better suited for technical roles in society, while those who excel in the moral or social sphere are more fit for socially oriented roles. The actual sorting of students in high school typically occurs by assigning students to classes for instruction based on their ability level. Since ability-grouped
classes vary by student and teacher characteristics and by the curriculum, they represent different social systems within the school. The conceptual attraction of the social system model of the classroom has stimulated a body of research over the past few decades. The model depicts how the teacher, formal and informal groups, and individual students interact in the classroom and carry out or cooperate with the functions of socialization and allocation. Examples of research based on this model include studies of teacher expectations, the quantity and quality of teacher–student interactions, gender and race effects on task-related and social interactions, and peer influences and student friendship groups.
2. Viewing the School as a Social System

Researchers are more likely to focus on the school, rather than the classroom, when utilizing the social system model to analyze education. Like the class, the school must perform the functions of socialization and allocation in order to play its assigned role in society. Viewing the school as a social system directs attention to how the parts of a school interact to carry out these functions.
2.1 Socialization

For students to be socialized successfully in school, they must cooperate in their education. They are less likely to resist socialization if they accept the authority of their teachers, agree with the school's policies and practices, and believe those policies and practices are fair. With student cooperation, a school is expected to attain its goal of graduating students who have internalized societal values and norms beyond those held by their families.
2.2 Allocation

To attain its goal of allocation, a school must locate students on an achievement hierarchy, defining each person's capability and aspirations. A critical factor in the school's ability to allocate students is that both the family and the school attach high value to achievement (Parsons 1959). When adults view student achievement as a high priority, students are more likely to internalize the motivation to achieve and to cooperate with adults as the latter orient them toward specific adult roles. When schools succeed in this effort, they match society's human resources with its adult occupational structure. The primary mechanism for the allocation process is the organizational differentiation of students for
instruction by ability. Middle and high schools in the United States typically assign students to Academic, Vocational, and General tracks. Students assigned to the Academic track take college preparatory courses, those assigned to the Vocational track take skills and job-related courses, and students assigned to the General track are offered both low-level academic courses and skills courses. Track assignment is a major determinant of whether a student advances to college or enters the job market after graduation. Theoretical and empirical studies demonstrate the way various parts of a school interact to socialize students and prepare them for the labor market. Research on the school as a bureaucracy, on the formal or informal organization of the school, and on social networks within the school illustrates this approach. These and other studies demonstrate the heuristic value of conceptualizing the school as a social system and show how this model has integrated a wide variety of studies about schooling.
3. Viewing the School as a Subsystem in Society

Many social scientists have used the social system model to analyze the role of various institutions in society. Education is seen as one of society's primary institutions, along with religion, the economy, and the judicial system. The aim of a social system approach to the study of schools in society is to ascertain how schooling enables society to achieve its goals.

3.1 Structural Functionalist Perspective on Schools in Society

A theoretical perspective that dominated early twentieth-century societal analysis was structural functionalism (Durkheim 1956, Parsons 1951). A major premise of structural functionalism is that a society must perform a set of functions in order to survive. According to Parsons (1956), these functions are: obtaining and utilizing resources from the system's external environment; setting goals for the system and generating the motivation and effort to attain these goals; regulating the units of the system to ensure maintenance and to avoid destructive conflicts; and storing and distributing cultural symbols, ideas, and values. Schools perform these functions by socializing students to societal values, by providing a common culture and language, by enabling and encouraging competition, and by preparing students for the labor market. Structural functionalism purports that a social system must exist in a state of equilibrium or, if disrupted, must make adjustments to restore balance. If one social institution in society undergoes major change, interrelated social institutions are expected to accommodate this change and to bring society back to a stable state. For example, if the economy of a society
were to change, students would need to be socialized and allocated in different ways to support the new economic structure. The social system would then be restored to balance, though it would differ structurally from its original state.

3.1.1 Limitations of the structural functionalist model. The structural functionalist perspective has been challenged on many grounds. The most common criticism questions its assumption of systemic equilibrium. Critics claim that structural functionalism ignores the processes of social change internal to a social system. They argue that the relationships among the parts of a social system tend to change over time and that the social system that emerges as a result of these changes may differ fundamentally from the original system. Another criticism of structural functionalism regards its claim that a social system continues to operate as intended if all the parts of the system perform their functions. This assumption fails to take into account the possibility of external shocks to the system. An external shock could alter the pattern of interactions among the system's parts in such a way as to disrupt its stability, leading to a basic restructuring of the system or, possibly, to its disintegration. Finally, the structural functionalist perspective has been criticized on ideological or political grounds. Critics claim that a structural functionalist perspective portrays the school as an institution that supports and perpetuates the existing social order and its stratification system. They argue that while schools reward students for ability and achievement, they also maintain the influence of ascribed characteristics on adult success. Structural functionalists fail to analyze the extent to which schools preserve a class-based society. Not only does structural functionalism ignore the way schools perpetuate the status quo; it also fails to explain how ascriptive characteristics mediate the effects of achievement after graduation. By linking occupational success to organizational characteristics of schools and academic achievement, structural functionalism fails to explain the poor fit that often exists between a student's skills and abilities and the student's future place in the labor market. Critics of structural functionalism argue that a job seeker can often negotiate with a prospective employer, and that this process allows an individual's ascribed characteristics and social status to influence job placement. In short, structural functionalism is generally viewed as a static theory, one which only partially describes the interactions in a school system, or between schools and the rest of society. Even when social change is incorporated into the model, the change is not seen as leading to a radical transformation of the school. As a result, the theory fails to depict the more dynamic and controversial dimensions of schooling. Nevertheless, structural functionalism has been a useful theoretical
perspective to explain how schools and classrooms function as social systems under certain conditions in society.
3.2 Conflict Perspective on the School as a Social System

Conflict theory (Bowles and Gintis 1976, Collins 1971) is an alternative theoretical model for the analysis of the school as a social system. Conflict theory posits competition as the major force driving societal development. While structural functionalism views technological needs and economic growth as the major influences on society, conflict theory argues that competition for wealth and power is the primary state of society, and social change is its inevitable result. According to conflict theory, ascribed characteristics are the basis of elite status. Collins (1971) argues that the continuing effort of the elite to exert control over lower status groups creates an ongoing struggle for power and prestige. Conflict theory specifies conditions under which subordinates are likely to resist the domination of superiors through noncompliance and resistance. Under conditions of economic hardship, political turmoil, or cultural conflict, nonelites are more likely to resist domination and to challenge the relationship that exists between education and occupation. Their discontent typically precipitates social change. Conflict theory not only explains the relationship between education and occupation; it also yields insights into the power struggles that occur between schools and other social groups in society. For example, during the student movement in the USA in the 1960s and 1970s, students resisted authority and the status quo. Tensions and disruptions continued until students were granted a greater voice in university affairs and in political life. Current controversies about school prayer, sex education, curriculum content, racial integration, and school vouchers are led by competing interest groups. These controversies are likely to lead to compromises that involve some redistribution of authority and power. The power struggle that occurs in society may also be observed in the school and in the classroom. In schools, students create their own value system, which may be inconsistent with the school's academic goals. The tests and grades administered by schools as a sorting mechanism may be viewed as a way to maintain the status quo, to enforce discipline and order, or to co-opt the most intelligent of the lower classes. In the classroom, students may challenge teacher authority and negotiate teacher power through resistance. By directly addressing the relationship between conflict and social change, conflict theory supplements structural functionalism in explaining the behavior of schools in society and in predicting social change. Both theories point to important aspects of
the dynamics of social systems. As stressed by structural functionalists, schools do socialize and allocate students, mostly by meritocratic criteria. But conflict theorists are correct in maintaining that nonmeritocratic factors also influence the allocation process. Structural functionalists are accurate in stating that students typically cooperate with teachers in the learning process, but conflict theorists recognize that some students resist authority and rebel. Further, conflict occurs in communities, schools, and classrooms, but it is not always class-based. Relying on the insights of both theories increases our understanding of schools in society.
4. Coleman's Analysis of the School as a Social System

In his last major work on social theory, Coleman (1990) argued explicitly that the aim of sociology is to explain the behavior of social systems. He pointed out that social systems might be studied in two ways. In the first approach, the system is the unit of analysis, and either the behavior of a sample of social systems is studied or the behavior of a single social system is observed over time. Coleman's (1961) study of the adolescent subculture follows this strategy. The second approach involves examining internal processes in a social system, including the relationships that exist among parts of the system. Research on the association between track level in high school and a student's growth in academic achievement is an example of linking a subunit of a school to an individual. Studies of the transition from school to work illustrate the relationship between two subunits of society, education and the labor market. Social system analysis involves explaining three kinds of transition: macro to micro, micro to micro, and micro to macro (Alexander et al. 1987, Collins 1981, Knorr-Cetina and Cicourel 1981). The macro to micro transition involves the effects of the social system itself on subunits (typically individuals) in the system. An analysis of the effects of ability group level on students' educational aspirations is an example. The micro to micro transition links characteristics of subunits in the social system to the behavior of those subunits. An example is an analysis of the effects of gender on achievement. Finally, the micro to macro transition pertains to how the behavior of subunits in a social system influences the social system as a whole. Research on the effects of student achievement on the way the school organizes students for instruction is illustrative. Coleman argued that the micro to macro transition is the most difficult to analyze, because it requires specifying the interdependency among subunits or individuals in a social system. His theory of purposive action provides a way to model this transition. In general, Coleman's insistence that the study of social
systems include an explicit focus on the transitions that exist between macro-level and micro-level processes in the social system promises to draw greater attention to the social system approach to the study of schooling in society.
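Coleman's transitions map naturally onto the two-level (students within schools) regression models now common in school research. The Python sketch below is an illustration added here, not Coleman's own method; all variable names, data, and effect sizes are hypothetical. A school-level condition stands in for a macro-to-micro effect and a student-level attribute for a micro-to-micro effect.

```python
# Hypothetical illustration of macro-to-micro and micro-to-micro
# transitions as a two-level (students within schools) regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 40, 25

rows = []
for s in range(n_schools):
    academic_track = rng.integers(0, 2)    # macro: school-level condition
    school_effect = rng.normal(0, 0.5)     # unobserved school heterogeneity
    for _ in range(n_students):
        female = rng.integers(0, 2)        # micro: student-level attribute
        aspiration = (1.0 * academic_track # macro-to-micro effect
                      + 0.3 * female       # micro-to-micro effect
                      + school_effect
                      + rng.normal(0, 1))
        rows.append(dict(school=s, academic_track=academic_track,
                         female=female, aspiration=aspiration))
df = pd.DataFrame(rows)

# Random-intercept model: the fixed effects recover the two transitions,
# while the school variance component captures the macro level itself.
fit = smf.mixedlm("aspiration ~ academic_track + female",
                  df, groups=df["school"]).fit()
print(fit.summary())
```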
5. Conclusions

Conceptualizing the school as a social system is a useful approach to the study of schools. The social system model has led to new theoretical insights about how education, as an institution, affects other societal institutions. It has also generated a significant body of empirical research that demonstrates the interdependence of subunits within a school and of schools within larger organizational units, and their effects on social outcomes. These studies have yielded a better understanding of the role schools play and the contribution they make to contemporary social life.

See also: Coleman, James Samuel (1926–95); Educational Institutions and Society; Socialization in Adolescence; Socialization in Infancy and Childhood; Socialization, Sociology of; System: Social
Bibliography

Alexander J, Giesen B, Münch R, Smelser N 1987 The Micro-Macro Link. University of California Press, Berkeley, CA
Bowles S, Gintis H 1976 Schooling in Capitalist America: Educational Reform and the Contradictions of Economic Life. Basic Books, New York
Coleman J S 1961 The Adolescent Society. Free Press, New York
Coleman J S 1990 Foundations of Social Theory. Belknap Press of Harvard University Press, Cambridge, MA
Collins R 1971 Functional and conflict theories of educational stratification. American Sociological Review 36: 1002–19
Collins R 1981 On the micro-foundations of macro-sociology. American Journal of Sociology 86: 984–1015
Durkheim E 1956 Education and Sociology. Free Press, New York
Gamoran A, Secada W, Marrett C 2000 The organizational context of teaching and learning: Changing theoretical perspectives. In: Hallinan M (ed.) Handbook of the Sociology of Education. Kluwer Academic/Plenum, New York, Chap. 2, pp. 37–64
Gordon C 1957 The Social System of the High School. Free Press, Glencoe, IL
Hallinan M 1994 Tracking: From theory to practice. Sociology of Education 67(2): 79–84, 89–91
Knorr-Cetina K, Cicourel A (eds.) 1981 Advances in Social Theory and Methodology. Routledge and Kegan Paul, London
Loomis C P, Dyer E 1976 Educational social systems. In: Loomis C P (ed.) Social Systems: The Study of Sociology. Schenkman, Cambridge, MA, Chap. 7
Lynch K 2000 Research and theory on equality and education. In: Hallinan M (ed.) Handbook of the Sociology of Education. Kluwer Academic/Plenum, New York, Chap. 4, pp. 85–106
Oakes J 1994 More than misapplied technology: A normative and political response to Hallinan on tracking. Sociology of Education 67(2): 84–9, 91
Parsons T 1951 The Social System. Free Press, New York
Parsons T 1956 Economy and Society: A Study in the Integration of Economic and Social Theory. Free Press, Glencoe, IL
Parsons T 1959 The school class as a social system: Some of its functions in American society. Harvard Educational Review 29: 297–318
Parsons T 1961 An outline of the social system. In: Parsons T, Shils E, Naegele K, Pitts R (eds.) Theories of Society. Free Press, New York, Vol. 1
Smelser N 1988 Social structure. In: Smelser N (ed.) Handbook of Sociology. Sage, Newbury Park, CA, pp. 103–29
M. Hallinan
School Effectiveness Research

1. School Effectiveness and School Effectiveness Research

In the most general sense 'school effectiveness' refers to the level of goal attainment of a school. Although average achievement scores in core subjects, established at the end of a fixed program, are the most probable 'school effects,' alternative criteria like the responsiveness of the school to the community and the satisfaction of the teachers may also be considered. Assessment of school effects occurs in various types of applied contexts, like the evaluation of school improvement programs or the comparison of schools for accountability purposes by governments, municipalities, or individual schools. School effectiveness research attempts to deal with the causal aspects inherent in the effectiveness concept by means of scientific methods. It considers not only the assessment of school effects, but particularly the attribution of differences in school effects to malleable conditions. Usually, school effects are assessed in a comparative way, e.g., by comparing average achievement scores between schools. In order to determine the 'net' effect of malleable conditions, like the use of different teaching methods or a particular form of school management, achievement measures have to be adjusted for intake differences between schools. For this purpose student background characteristics like socioeconomic status, general scholastic aptitudes, or initial achievement in a subject are used as control variables. This type of statistical adjustment in research studies has an applied parallel in the striving for 'fair comparisons' between schools, known under the label 'value-added.'
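To make the adjustment concrete, the following minimal sketch, which is not taken from the studies reviewed here, regresses achievement on intake characteristics and reads value-added school effects off the school-mean residuals. All data and variable names are invented for illustration; published studies typically use multilevel models rather than this simple two-step procedure.

    import numpy as np

    rng = np.random.default_rng(0)
    n_schools, n_students = 50, 40
    school = np.repeat(np.arange(n_schools), n_students)
    ses = rng.normal(size=school.size)             # socioeconomic status (control)
    prior = rng.normal(size=school.size)           # initial achievement (control)
    true_effect = rng.normal(0.0, 0.3, n_schools)  # simulated 'net' school effects
    y = 0.5 * prior + 0.3 * ses + true_effect[school] + rng.normal(size=school.size)

    # Adjust outcomes for intake differences with ordinary least squares.
    X = np.column_stack([np.ones(school.size), prior, ses])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta

    # A school's value-added estimate is its mean adjusted residual.
    value_added = np.array([residual[school == s].mean() for s in range(n_schools)])
    print(np.corrcoef(value_added, true_effect)[0, 1])

Ranking schools on such adjusted estimates, rather than on raw achievement means, is what the 'fair comparisons' mentioned above amount to.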
2. Strands of Educational Effectiveness Research

School effectiveness research has developed as a gradual integration of several research traditions.
The roots of current 'state-of-the-art' school effectiveness research are sketched below by briefly referring to each of these research traditions.
The elementary design of school effectiveness research is the association of hypothetical effectiveness-enhancing conditions of schooling with output measures, mostly student achievement. A basic model from systems theory depicts the school as a black box within which processes, or 'throughput,' transform inputs into outputs. The inclusion of an environmental or context dimension completes this model (see Fig. 1).

Figure 1 A basic systems model of school functioning

The major task of school effectiveness research is to reveal the impact of relevant input characteristics on output and to 'break open' the black box in order to show which process or throughput factors 'work,' next to the impact of contextual conditions. Within the school it is helpful to distinguish a school level and a classroom level and, accordingly, school organizational and instructional processes.
Research traditions in educational effectiveness vary according to the emphasis placed on the various antecedent conditions of educational outputs. These traditions also have a disciplinary basis. The
common denominator of the five areas of effectiveness research distinguished here is that in each case the elementary design applies: associating outputs or outcomes of schooling with antecedent conditions (input, process, or contextual). The following research areas or research traditions can be distinguished:
(a) Research on equality of opportunities in education and the significance of the school in this.
(b) Economic studies on education production functions.
(c) The evaluation of compensatory programs.
(d) Studies of unusually effective schools.
(e) Studies on the effectiveness of teachers, classes, and instructional procedures.
For a further discussion of each of these research traditions the reader is referred to Scheerens (1999). A schematic characterization of research orientation and disciplinary background is given in Table 1.
3. Integrated School Effectiveness Research

In recent school effectiveness studies these various approaches to educational effectiveness have become integrated. Integration was manifested in the conceptual modeling and the choice of variables. At the technical level, multilevel analysis has contributed significantly to this development. In contributions to the conceptual modeling of school effectiveness, schools became depicted as a set of 'nested layers' (Purkey and Smith 1983), where the central assumption was that higher organizational levels facilitated effectiveness-enhancing conditions at lower levels (Scheerens and Creemers 1989).
Table 1 General characteristics of types of school effectiveness research
(a) (Un)equal opportunities. Independent variables: socioeconomic status and IQ of pupils, material school characteristics. Dependent variable: attainment. Discipline: sociology. Main study type: survey.
(b) Production functions. Independent variables: material school characteristics. Dependent variable: achievement level. Discipline: economics. Main study type: survey.
(c) Evaluation of compensatory programs. Independent variables: specific curricula. Dependent variable: achievement level. Discipline: interdisciplinary pedagogy. Main study type: quasi-experiment.
(d) Effective schools. Independent variables: 'process' characteristics of schools. Dependent variable: achievement level. Discipline: interdisciplinary pedagogy. Main study type: case study.
(e) Effective instruction. Independent variables: characteristics of teachers, instruction, and class organization. Dependent variable: achievement level. Discipline: educational psychology. Main study type: experiment, observation.
Figure 2 An integrated model of school effectiveness (from Scheerens 1990)
In this way, a synthesis between production functions, instructional effectiveness, and school effectiveness became possible by including the key variables from each tradition, each at the appropriate 'layer' or level of school functioning (the school environment, the level of school organization and management, the classroom level, and the level of the individual student). Conceptual models developed according to this integrative perspective are those by Scheerens (1990), Creemers (1994), and Stringfield and Slavin (1992). The Scheerens model is shown in Fig. 2. Exemplary cases of integrative, multilevel school effectiveness studies are those by Brandsma (1993), Sammons et al. (1995), and Grisay (1996).
In Table 2 (cited from Scheerens and Bosker 1997) the results of three meta-analyses and a re-analysis of an international data set are summarized and compared to the results of more 'qualitative' reviews of the research evidence. The qualitative reviews were based on studies by Purkey and Smith (1983), Levine and Lezotte (1990), Scheerens (1992), and Sammons et al. (1995). The results concerning resource input variables are based on the re-analysis, carried out by Hedges et al. (1994), of Hanushek's (1979) summary of results of production function studies. As stated before, this re-analysis was criticized, particularly for the unexpectedly large effect of per pupil expenditure.
The results on 'aspects of structured teaching' are taken from meta-analyses conducted by Fraser et al. (1987). The international analysis was based on the IEA Reading Literacy Study and was carried out by Bosker (Scheerens and Bosker 1997, Chap. 7). The meta-analyses on school organizational factors, as well as on the instructional conditions 'opportunity to learn,' 'time on task,' 'homework,' and 'monitoring at classroom level,' were carried out by Witziers and Bosker and published in Scheerens and Bosker (1997, Chap. 6). The number of studies used for these meta-analyses varied per variable, ranging from 14 to 38 studies in primary and lower secondary schools.
The results in this summary of reviews and meta-analyses indicate that resource-input factors on average have a negligible effect, school factors have a small effect, and instructional variables have an average to large effect. The conclusion concerning resource-input factors should probably be modified and somewhat 'nuanced,' given the results of more recent studies referred to above, e.g., recent studies of class-size reduction. There is an interesting difference between the relatively small effect sizes for the school-level variables reported in the meta-analyses and the degree of certainty and consensus on the relevance of these factors in the more qualitative research reviews. It should be noted that the three blocks of variables depend on types of studies using different research methods. Education production function studies depend on statistics and administrative data from schools or higher administrative units, such as districts or states. School effectiveness studies focusing on school-level factors are generally carried out as field studies and surveys, whereas studies on instructional effectiveness are generally based on experimental designs.
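As a hedged illustration of how summary coefficients like those in Table 2 can be produced, the sketch below pools study-level correlations with a simple fixed-effect Fisher z average. The study values are invented, and the meta-analyses cited above used more elaborate procedures.

    import math

    # Hypothetical (correlation, sample size) pairs from three studies.
    studies = [(0.12, 220), (0.18, 150), (0.05, 480)]

    # Fisher z transform each correlation and weight by n - 3.
    weight_sum = sum(n - 3 for _, n in studies)
    pooled_z = sum((n - 3) * math.atanh(r) for r, n in studies) / weight_sum
    pooled_r = math.tanh(pooled_z)

    # Approximate 95 percent confidence interval, back-transformed to r.
    se = math.sqrt(1.0 / weight_sum)
    low, high = math.tanh(pooled_z - 1.96 * se), math.tanh(pooled_z + 1.96 * se)
    print(round(pooled_r, 3), round(low, 3), round(high, 3))

The back-transformation keeps the pooled estimate and its interval on the familiar correlation scale used throughout Table 2.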
4. Foundational and Fundamental School Effectiveness Studies

Foundational school effectiveness studies refer to basic questions about the scope of the concept of school effectiveness. Can a school be called effective on the basis of achievement results measured only at the end of a period of schooling, or should such a school be expected to show high performance at all grade levels? Can school effectiveness be assessed by examining results in just one or two school subjects, or should all subject-matter areas of the curriculum be taken into account? And should one not restrict the qualification of a school as effective to consistently high performance over a longer period of time, rather than a 'one shot' assessment at just one point in time? Fortunately all of these questions are amenable to empirical research.
Table 2 Reviews of the evidence from qualitative reviews, an international study, and research syntheses; coefficients are correlations with student achievement (plus signs refer to a positive assessment of the variable in question in the qualitative reviews). The table sets three columns of evidence (qualitative reviews, international analyses, and research syntheses) against three blocks of variables. Resource input variables: pupil–teacher ratio, teacher training, teacher experience, teachers' salaries, and expenditure per pupil. School organizational factors: productive climate culture, achievement pressure for basic subjects, educational leadership, monitoring/evaluation, cooperation/consensus, parental involvement, staff development, high expectations, and orderly climate. Instructional conditions: opportunity to learn, time on task/homework, monitoring at classroom level, aspects of structured teaching (cooperative learning, feedback, reinforcement), and differentiation/adaptive instruction.
These types of studies, which are associated with the consistency of school effects over grade levels, teachers, subject-matter areas, and time, have been referred to as 'foundational studies' (Scheerens 1993) because they are aimed at resolving issues that bear upon the scope and 'integrity' of the concept of school effectiveness. A recent review of such foundational studies is given in Scheerens and Bosker (1997, Chap. 3). Their results concerning primary schools are presented in Table 3. Consistency is expressed in terms of the correlation between two different rank orderings of schools. Results are based on arithmetic and language achievement.

Table 3 Consistency of school effectiveness (primary level); figures are average correlations, with ranges in parentheses
across time (stability, 1 or 2 years): r = 0.70 (0.34–0.87)
across grades: r = 0.50 (0.20–0.69)
across subjects: r = 0.70 (0.59–0.83)
Source: Scheerens and Bosker (1997, Chap. 3)

The results summarized in Table 3 indicate that there is reasonable consistency across cohorts and subjects, while the consistency across grades is only average. Results measured at the secondary level likewise show reasonably high stability coefficients (consistency across cohorts), and somewhat lower coefficients for consistency across subjects (e.g., in a French study (Grisay 1996), coefficients based on value-added results were 0.42 for French language and 0.27 for mathematics). The average consistency between subjects at the secondary level was somewhat lower than in the case of primary schools (r about 0.50). This phenomenon can be explained by the fact that, at the secondary level, different teachers usually teach different subjects, so that inconsistency is partly due to variation between teachers. The few studies in which factor analysis was used to examine the size of a stable school factor relative to year-specific and subject-specific effects have shown results varying from a school factor explaining 70 percent of the subject- and cohort-specific (gross) school effects, down to 25 percent.
The picture that emerges from these studies on the stability and consistency of school effects is far from unequivocally favorable with respect to the unidimensionality of the school effects concept. Consistency is fair when effects at the end of a period of schooling are examined over a relatively short time interval. When grade level and subject-matter area are brought into the picture, consistency coefficients tend to be lower, particularly when different teachers teach different grades or subjects. School effects are generally seen as teacher effects, especially at the secondary school level. The message from these 'foundational studies' is that one should be careful not to overgeneralize the results of school effectiveness studies when only results in one or two subject-matter areas at one point in time are measured. Another implication is that hypothetical antecedent conditions of effects are to be sought not only at the school organizational level, but also at the level of teaching and the teaching–learning process.
Fundamental school effectiveness studies are theory- and model-driven studies. Bosker and Scheerens (1994) presented alternative causal specifications of the conceptual multilevel models referred to in an earlier section. These models attempt to grasp the nature of the relationships between, for example, school and classroom conditions: whether such relationships are additive, interactive, reciprocal, or form a causal chain. Other studies that have attempted to formalize these types of relationships are those by Hofman (1995) and Creemers and Reezigt (1996). In general, it appeared to be difficult to establish a better 'fit' for any one of the alternative model specifications. More complex models, based on the axioms of microeconomic theory, have been tested by de Vos (1998), making use of simulation techniques. So far, such studies are too few to draw general conclusions about substantive outcomes; continuation of this line of study is quite interesting, however, also from a methodological point of view.
From a substantive point of view, educational effectiveness studies have indicated the relatively small effects of schooling conditions in developed countries, where provisions are at a uniformly high level. At the same time, the estimates of the impact of innate abilities and socioeconomic background characteristics, also when evaluated as contextual effects, seem to grow as studies become more methodologically refined. Given the generally larger variation in both the conditions and the outcomes of schooling in developing countries, and the sometimes appallingly low levels of both, there is both societal and scientific relevance in studying school effectiveness in these settings.
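As a sketch of what testing such alternative specifications can look like, the code below (not the models actually fitted by Bosker and Scheerens) compares an additive two-level random-intercept model with one that adds a cross-level interaction between a student-level covariate and a hypothetical school-level condition. All data and variable names are invented for illustration.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n_schools, n_students = 40, 30
    school = np.repeat(np.arange(n_schools), n_students)
    climate = rng.normal(size=n_schools)   # hypothetical school-level condition
    prior = rng.normal(size=school.size)   # student-level covariate
    u = rng.normal(0.0, 0.4, n_schools)    # random school intercepts
    y = (0.5 * prior + 0.2 * climate[school]
         + 0.1 * prior * climate[school] + u[school]
         + rng.normal(size=school.size))

    data = pd.DataFrame({"y": y, "prior": prior,
                         "climate": climate[school], "school": school})

    # Fit an additive specification and one with a cross-level interaction.
    additive = smf.mixedlm("y ~ prior + climate", data, groups=data["school"]).fit()
    interactive = smf.mixedlm("y ~ prior * climate", data, groups=data["school"]).fit()
    print(additive.llf, interactive.llf)  # compare the log-likelihoods

In practice competing specifications would be compared with likelihood-ratio tests or information criteria, which is close in spirit to the model 'fit' comparisons mentioned above.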
5. The Future of School Effectiveness Research

From this article it can be concluded that school effectiveness research can be defined in a broad and
a narrow sense. In the broadest sense one could refer to all types of studies which relate school and classroom conditions to achievement outcomes, preferably after taking into account the effects of relevant student background variables. In the narrower sense, state-of-the-art integrative school effectiveness studies and foundational and fundamental studies could be seen as the core. Following the broader definition, the future of school effectiveness studies is guaranteed, particularly in the sense of 'applied' studies, like cohort studies, large-scale effect studies carried out for accountability purposes, monitoring studies, and assessment studies. State-of-the-art, fundamental, and foundational school effectiveness studies are a much more vulnerable species. One of the major difficulties is the organizational complexity and cost of the 'state-of-the-art' types of study. Given the shortage of these kinds of studies, the more fundamental and foundational types of studies are likely to be dependent on the quality of data sets that have been acquired for 'applied' purposes. Therefore the best guarantee for continued fundamental school effectiveness research lies in the enhanced research-technical quality of applied studies. One example consists of the quasi-experimental evaluation of carefully designed school improvement projects. Another important development is the use of IRT (Item Response Theory) modeling and 'absolute' standards in assessment programs. If school effects can be defined in terms of the distance or closeness of average achievement to a national or even international standard, some of the interpretation weaknesses of comparative standards belong to the past.

See also: Educational Assessment: Major Developments; Program Evaluation; School Improvement; School Management; School Outcomes: Cognitive Function, Achievements, Social Skills, and Values; Teacher Behavior and Student Outcomes
Bibliography

Bosker R J, Scheerens J 1994 Alternative models of school effectiveness put to the test. In: Bosker R J, Creemers B P M, Scheerens J (eds.) Conceptual and Methodological Advances in Educational Effectiveness Research, special issue of the International Journal of Educational Research 21: 159–80
Brandsma H P 1993 Basisschoolkenmerken en de Kwaliteit van het Onderwijs [Characteristics of primary schools and the quality of education]. RION, Groningen, The Netherlands
Creemers B P M 1994 The Effective Classroom. Cassell, London
Creemers B P M, Reezigt G J 1996 School level conditions affecting the effectiveness of instruction. School Effectiveness and School Improvement 7: 197–228
Fraser B L, Walberg H J, Welch W W, Hattie J A 1987 Syntheses of educational productivity research. Special issue of the International Journal of Educational Research 11
Grisay A 1996 Évolution des acquis cognitifs et socio-affectifs des élèves au cours des années de collège [Evolution of cognitive and socio-affective outcomes during the years of secondary education]. Université de Liège, Liège, Belgium
Hanushek E A 1979 Conceptual and empirical issues in the estimation of educational production functions. Journal of Human Resources 14: 351–88
Hedges L V, Laine R D, Greenwald R 1994 Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher 23: 5–14
Hofman W H A 1995 Cross-level relationships within effective schools. School Effectiveness and School Improvement 6: 146–74
Levine D U, Lezotte L W 1990 Unusually Effective Schools: A Review and Analysis of Research and Practice. National Center for Effective Schools Research and Development, Madison, WI
Purkey S C, Smith M S 1983 Effective schools: A review. The Elementary School Journal 83: 427–52
Sammons P, Hillman J, Mortimore P 1995 Key Characteristics of Effective Schools: A Review of School Effectiveness Research. OFSTED, London
Scheerens J 1990 School effectiveness and the development of process indicators of school functioning. School Effectiveness and School Improvement 1: 61–80
Scheerens J 1992 Effective Schooling: Research, Theory and Practice. Cassell, London
Scheerens J 1993 Basic school effectiveness research: Items for a research agenda. School Effectiveness and School Improvement 4: 17–36
Scheerens J 1999 School Effectiveness in Developed and Developing Countries: A Review of the Research Evidence. World Bank paper, Washington, DC
Scheerens J, Bosker R J 1997 The Foundations of Educational Effectiveness. Elsevier Science Ltd., Oxford, UK
Scheerens J, Creemers B P M (eds.) 1989 Developments in school effectiveness research. Special themed issue of the International Journal of Educational Research 13
Stringfield S C, Slavin R E 1992 A hierarchical longitudinal model for elementary school effects. In: Creemers B P M, Reezigt G J (eds.) Evaluation of Effectiveness. ICO-Publication 2. GION, Groningen, The Netherlands
de Vos H 1998 Educational Effects: A Simulation-Based Analysis. University of Twente, Enschede, The Netherlands
J. Scheerens
School Improvement

1. School Improvement

Different terms are used to describe change processes in schools and educational systems as a whole. They include 'educational change,' 'innovation,' 'reform,' and 'development.' These terms are quite often used in an interchangeable manner, and their meaning in the literature is frequently ambiguous. Many authors do not define them, but instead tend to trust everyday usage to convey the meaning of the words. Other authors define the concept by including in the definition important conditions and crucial characteristics of the processes. In defining these different concepts generally, and in distinguishing school improvement from other concepts, at least three points should be taken into consideration.
(a) The level in the educational system where the change takes place: the school, the district, or the state. Ultimately, every change process has to reach the classroom and the level of the individual student. In order to make a distinction, educational change at national level is often called 'systemic reform.' The term 'school improvement' makes clear that the change takes place at school level.
(b) In all definitions, distinctions should be made between, on the one hand, the process of educational change, the strategies used, and the characteristics of the change process, and, on the other, the outcomes of the change process. In everyday use, the term can refer to any one of these aspects or even to a combination of them: for example, to the strategy and the product of change. Fullan clarifies what he calls 'the confusion between the terms change and progress' (Fullan 1991, p. 4). Not every change is progress: the actual outcome of the change processes in terms of progress is not as important as the intended outcomes in the desired direction.
(c) When it comes to the outcomes of change, there is often no well-defined concept of what the change is aimed at: the context of education, the inputs and processes, or, ultimately, the outcomes in terms of student achievement.
The International School Improvement Project (ISIP) defines school improvement as 'a systematic sustained effort aimed at change in learning conditions and other related internal conditions in one or more schools, with the ultimate aim of accomplishing educational goals more effectively' (Van Velzen et al. 1985). This definition specifies improvement as an innovation or planned change with specific means (change in learning and other internal conditions of the school) and specific goals (ultimately to accomplish educational goals more effectively).

2. Practice and Theory in School Improvement
School improvement as a process has been demarcated into three global phases: the initiation, the implementation, and the continuation processes. Stakeholders of school improvement are more narrowly circumscribed and consist of students, teachers, principals, and parents at school level, and of local educational authorities, consultants, and the local community at local level. In exceptional cases, where it is a regional or national strategy to improve all schools, the stakeholders at these levels are also relevant.
School improvement is a very widespread phenomenon, and a wide variety of improvement efforts can be found. For example, there is a lot of improvement going on which does not aim at enhancing student outcomes at all. These types of improvement focus, for instance, on the career development of teachers, on restructuring the organization of the school, on the way decisions are made, or on the relationships between schools and their clients. Sometimes restructuring takes place at classroom level. Changes in the writing practices of elementary schoolteachers, for example, have been described in great detail. Nevertheless, the actual impact of the changes on students and on student achievement is usually left out.
Some schools practice improvement on their own and try to find their own solutions to their problems. Other schools have chosen to implement, as accurately as possible, improvement programs that have been developed elsewhere. This is called fidelity implementation. Often they do not consider alternative educational options. For example, they may opt for a specific program primarily because another school was satisfied with it (Stringfield 1995). Some schools are involved in improvement only because their government expects them to be (Hopkins et al. 1994). Depending on the extent to which educational policies are translated into clear outlines and contents, fidelity implementation is a more or less appropriate concept. When the educational policy is rather prescriptive with respect to curricular content and outcome levels, as it is in the UK (Hopkins et al. 1994), it is theoretically possible to check whether or not schools are implementing this policy. When the educational policy is rather open-ended, as is the Dutch policy on educational priorities (Van der Werf 1995) or the Dutch inclusion policy, it is virtually impossible to apply the notion of fidelity implementation.
Only a small part of school improvement is based on research (Stringfield 1995). Innovations are hardly ever tested before they are implemented in educational practice, and an adequate evaluation of their impact is rare. The same applies to the use of experimental or quasi-experimental designs in improvement projects. Some school improvers have preferred forms of action research instead of research-based experiments. Sometimes projects based on changing only a couple of factors report great successes. It may be that changes in just a few important areas can alter a whole system, but the question is whether these changes, often cases of educational mass hysteria, will stand the test of time. In education, the margins for change are generally small, because of the impact of noneducational factors such as the individual characteristics of students. Evaluations are often not carried out satisfactorily or are performed after a timespan that is too short. School improvement projects have shown that it takes time to design, develop, implement, and evaluate changes in schools: more time than is often available.
Because of inadequate evaluations, questions about the causation of effects, extremely important for the school-effectiveness knowledge base, cannot be answered. School improvement, therefore, should consistently pursue assessment of its results, pay attention to its failures and try to learn from these, and limit its goals to prevent confusion of cause and effect (Hopkins et al. 1999).
In school improvement, the orientation towards educational practice and policy-making is emphasized. Although schools and classrooms can be found in educational practice that succeed much better than others, theories cannot be based on exemplary practice only. Insofar as theoretical notions have been developed, they have not yet been empirically and systematically tested. The typologies of school cultures (Hargreaves 1995), for example, have not been studied in educational practice, and their effects on the success of school improvement are as yet unknown. Moreover, their relationships to the first criterion for school improvement (enhancing student outcomes) are not always very clear. The same holds for the factors that are supposed to be important in different stages of educational change, outlined by Fullan (1991), and for his ideas about the essential elements in educational change at classroom level (beliefs, curriculum materials, and teaching approaches). Even though these ideas are derived from school improvement practice, their importance and their potential effects are not accounted for in detail, and have not yet been studied in research.
A contribution to theory development is the generic framework for school improvement provided by Hopkins (1996). In this framework, three major components are depicted: educational givens, a strategic dimension, and a capacity-building dimension. Educational givens cannot be changed easily. Givens can be external to the school (such as an external impetus for change) and internal (such as the school's background, organization, and values). The strategic dimension refers to the competency of a school to set its priorities for further development, to define student learning goals and teacher development, and to choose a strategy to achieve these goals successfully. The capacity-building dimension refers to the need to focus on conditions for classroom practice and for school development during the various stages of improvement. Finally, the school culture has a central place in the framework. Changes in the school culture will support teaching–learning processes which will, in turn, improve student outcomes (Hopkins 1996).
Despite the obvious gaps in theory development and testing in the field of school improvement, there are already some elements of a knowledge base. By trying to improve schools, knowledge on the implementation of classroom and school effectiveness factors in educational practice has become available. This has provided the possibility of studying, to
different degrees, the influence of factors and variables on educational outcomes (Stoll and Fink 1994).
3. The Link Between School Improvement and School Effectiveness

School effectiveness and school improvement conjoin because of their mutual interest, although their actual relationship may be very complicated (Reynolds et al. 1993, Creemers and Reezigt 1997). The major aim in the field of school effectiveness has always been to link theory and research on the one hand, and practice and policy-making on the other. School improvement can be fostered by a knowledge base covering what works in education that can be applied in educational practice. The combination of theory, research, and development is not new in education. Almost all movements start out to make knowledge useful for educational practice and policy-making, or state their goal in terms of supplementing policy and practice with a knowledge base supplied by theory and research from a cyclical point of view. The next step is to use practical knowledge for further advances in theory and research. In this way, research and improvement can have a relationship that is a surplus benefit for both.
School effectiveness has led to major shifts in educational policy in many countries by emphasizing the accountability of schools and the responsibility of educators to provide all children with opportunities for high achievement, thereby enhancing the need for school improvement (Mortimore 1991). School effectiveness has pointed to the need for school improvement, in particular by focusing on alterable school factors. School improvement projects were necessary to find out how schools could become more effective. These projects were often supposed to implement effective school factors in educational practice (Scheerens 1992) and, in doing so, could yield useful feedback for school effectiveness. School improvement might point to inaccurate conceptions of effectiveness, such as the notion of linearity or one-dimensionality (Hargreaves 1995). In addition, school improvement might give more insight into the strategies for changing schools successfully in the direction of effectiveness.
The relatively short history of school effectiveness and improvement shows some successes of this linkage. Research results are being used in educational practice, sometimes with good results. School improvement findings are sometimes used as input for new research. Renihan and Renihan state that 'the effective schools research has paid off, if for no other reason than that it has been the catalyst for school improvement efforts' (Renihan and Renihan 1989, p. 365). Most authors, however, are more skeptical (Reynolds et al. 1993). Fullan states that school effectiveness 'has mostly focused on narrow educational goals, and
the research itself tells us almost nothing about how an effective school got that way and if it stayed effective' (Fullan 1991, p. 22). Stoll and Fink (1992) think that school effectiveness should have done more to make clear how schools can become effective. According to Mortimore (1991), a lot of improvement efforts have failed because research results were not translated adequately into guidelines for educational practice. Changes were sometimes forced upon a school, and when the results were disappointing the principals and teachers were blamed. Teddlie and Roberts (1993) suggest that effectiveness and improvement representatives do not cooperate automatically, but tend to see each other as competitors.
Links between school effectiveness and school improvement were stronger in some countries than in others (Reynolds 1996). In the early years of school effectiveness, links were strong in the USA and never quite disappeared there. Many districts have implemented effective schools programs in recent years, but research in the field has decreased at the same time, and, because of this, school improvement is sometimes considered 'a remarkable example of (…) over-use of a limited research base' (Stringfield 1995, p. 70).
Reynolds et al. (2000) came to the conclusion, based on analyses of Dutch, British, and North American initiatives, that the following principles are fundamental to a successful merger of school effectiveness and school improvement: (a) a focus on teaching, learning, and the classroom level; (b) use of data for decision making; (c) a focus on pupil outcomes; (d) addressing schools' internal conditions; (e) enhanced consistency (through implementation of 'reliable' programs); and (f) pulling levers to affect all levels, both within and beyond the school.
Recently, several projects (Stoll et al. 1996, Hill 1998) have started to integrate school effectiveness and school improvement. They form successful examples of the concept of sustained interactivity. These projects all share a clear definition of the problem that should be overcome, in terms of student outcomes and the classroom strategies needed to enhance these outcomes within the context of the school. Often, the outcomes are clearly specified for one school subject or for elements of a school subject, such as comprehensive reading. The content of the projects is a balanced mix of the effectiveness knowledge base and concepts from school improvement. The projects have detailed designs, both for the implementation of school improvement and for evaluation in terms of empirical research. By means of a research component integrated into the projects right from the start, it is possible to test effectiveness hypotheses and to evaluate improvement outcomes at the same time. The use of control groups is essential in this respect, and various projects now incorporate control groups or choose to compare
their results to norm groups on the basis of nationwide tests. Also, many projects are longitudinal in their designs. Although most integrated projects have started recently, some of them have been running for almost a decade now, and they have been disseminated to various educational contexts. Therefore, long-term effects and context-specific effects can easily be tracked by means of follow-up measurement. An additional feature of projects which last for several subsequent years is the possibility of testing the effectiveness of school improvement strategies and changing strategies whenever necessary.
The Halton Effective Schools Project illustrates this feature clearly (Stoll and Fink 1996, Stoll et al. 1996). The project started in Canada in 1986 with the intention of implementing British effectiveness knowledge in Halton district schools. It soon became clear that the effectiveness knowledge base in itself would not automatically lead to changes in educational practice. Over the years, the project paid a lot of attention to questions about the processes of change in schools. It focused on the planning process in schools, the teaching and learning processes in classrooms, and staff development. Successful changes turned out to be enhanced by a collaborative school culture, a shared vision of what the school will stand for in the future, and a climate in which decisions are made in a transparent way.
Based on their Halton experiences, Stoll and Fink (1996) have developed a conceptual model which links school effectiveness and school improvement through the school development planning process. The model blends the school effectiveness knowledge base with knowledge about change processes in schools. The school development planning process is at the center of the model. The process is considered to be multilayered. Two outer layers comprise invitational leadership, and continuing conditions and cultural norms. The inner cycle layer is formed by the ongoing planning cycle of assessment, planning, implementation, and evaluation of educational processes. The two central core layers refer to a strong focus on the teaching–learning processes and the curriculum, and on the students in the school. The school development planning process is influenced by the context of the school and by foundations such as research findings, and, in turn, influences intermediate outcomes at teacher and school levels, as well as student outcomes. Finally, the planning process is influenced by several partners (external agencies, educational networks).
In the UK, 66 percent of improvement programs now pursue goals which fit into the school effectiveness tradition of student outcomes (Reynolds et al. 1996). However, it is still not clear whether real improvements will always occur. The IQEA project (Improving the Quality of Education for All, Hopkins et al. 1994) started in 1991 as a staff development project. Gradually, a focus on classroom improvement and its effects on student achievement took over. The project does
not stop with the implementation of priorities for development, but also pays explicit attention to conditions that will sustain the changes. These are staff development, involvement, leadership, coordination, inquiry and reflection, and collaborative planning (Stoll et al. 1996). In addition, when the classroom became the center of attention, the project specified the classroom-level conditions that are necessary for effective teaching and student achievement. These are authentic relationships, rules and boundaries, teacher's repertoire, reflection on teaching, resources and preparation, and pedagogic partnerships. Other promising projects in the UK are the Lewisham School Improvement Project and the Hammersmith and Fulham LEA Project. Both projects actively try to enhance student achievement by means of school effectiveness knowledge, and both projects are cooperating with a research institute (Reynolds et al. 1996).
Recent theories about school improvement stress the self-regulation of schools. Self-regulation assumes target setting and embodies the behavioral concepts of mechanisms, feedback, and reinforcement. This self-regulatory approach to school improvement can be combined with the analogous self-regulatory feedback loops in educational effectiveness, which are again further elaborated in the upward-spiraling school development planning process (Stoll and Fink 1996, Hopkins 1995).
The analysis of school improvement efforts over a period of time has resulted in a distinction between several types of schools (effective vs. ineffective and improving vs. declining). Stoll and Fink (1998) and Hopkins et al. (1999) give descriptions of those schools and the features of the change processes going on in them. The declining effective school (the 'cruising school') has received particular attention because it points to the difficulties of keeping educational quality at the same level (Fink 1999).

See also: Educational Evaluation: Overview; Educational Innovation, Management of; Educational Institutions and Society; Educational Policy: Comparative Perspective; Educational Policy, Mechanisms for the Development of; Educational Research and School Reform; School Administration as a Field of Inquiry; School Effectiveness Research; School Management
Bibliography

Creemers B P M, Reezigt G J 1997 School effectiveness and school improvement: Sustaining links. School Effectiveness and School Improvement 8: 396–429
Fink D 1999 The attrition of change: A study of change and continuity. School Effectiveness and School Improvement 10: 269–95
Fullan M G 1991 The New Meaning of Educational Change. Teachers College Press, New York
Hargreaves D H 1995 School culture, school effectiveness and school improvement. School Effectiveness and School Improvement 6: 23–46
Hill P W 1998 Shaking the foundations: Research driven school reform. School Effectiveness and School Improvement 9: 419–36
Hopkins D 1995 Towards effective school improvement. School Effectiveness and School Improvement 6: 265–74
Hopkins D 1996 Towards a theory for school improvement. In: Gray J, Reynolds D, Fitz-Gibbon C, Jesson D (eds.) Merging Traditions: The Future of Research on School Effectiveness and School Improvement. Cassell, London, pp. 30–51
Hopkins D, Ainscow M, West M 1994 School Improvement in an Era of Change. Teachers College Press, New York
Hopkins D, Reynolds D, Gray J 1999 Moving on and moving up: Confronting the complexities of school improvement in the improving schools project. Educational Research and Evaluation 5: 22–40
Mortimore P 1991 School effectiveness research: Which way at the crossroads? School Effectiveness and School Improvement 2: 213–29
Renihan F I, Renihan P J 1989 School improvement: Second generation issues and strategies. In: Creemers B, Peters T, Reynolds D (eds.) School Effectiveness and School Improvement. Swets and Zeitlinger, Amsterdam/Lisse, pp. 365–77
Reynolds D 1996 Country reports from Australia, the United Kingdom, and the United States of America. Introduction and overview. School Effectiveness and School Improvement 7: 111–13
Reynolds D, Hopkins D, Stoll L 1993 Linking school effectiveness knowledge and school improvement practice: Towards a synergy. School Effectiveness and School Improvement 4: 37–58
Reynolds D, Sammons P, Stoll L, Barber M, Hillman J 1996 School effectiveness and school improvement in the United Kingdom. School Effectiveness and School Improvement 7: 133–58
Reynolds D, Teddlie C, Hopkins D, Stringfield S 2000 School effectiveness and school improvement. In: Teddlie C, Reynolds D (eds.) The International Handbook of School Effectiveness Research. Falmer Press, London, pp. 206–31
Scheerens J 1992 Effective Schooling: Research, Theory and Practice. Cassell, London
Stoll L, Fink D 1992 Effecting school change: The Halton approach. School Effectiveness and School Improvement 3: 19–42
Stoll L, Fink D 1994 School effectiveness and school improvement: Voices from the field. School Effectiveness and School Improvement 5: 149–78
Stoll L, Fink D 1996 Changing our Schools: Linking School Effectiveness and School Improvement. Open University Press, Buckingham, UK
Stoll L, Fink D 1998 The cruising school: The unidentified ineffective school. In: Stoll L, Myers K (eds.) No Quick Fixes: Perspectives on Schools in Difficulty. Falmer Press, London, pp. 189–206
Stoll L, Reynolds D, Creemers B, Hopkins D 1996 Merging school effectiveness and school improvement: Practical examples. In: Reynolds D, Creemers B, Bollen R, Hopkins D, Stoll L, Lagerweij N (eds.) Making Good Schools: Linking School Effectiveness and School Improvement. Routledge, London, pp. 113–47
Stringfield S 1995 Attempting to enhance students' learning through innovative programs: The case for schools evolving into High Reliability Organizations. School Effectiveness and School Improvement 6: 67–96
Teddlie C, Roberts S P 1993 More Clearly Defining the Field: A Survey of Subtopics in School Effectiveness Research. Paper presented at the Annual Meeting of the American Educational Research Association, Atlanta, GA
Van der Werf M P C 1995 The Educational Priority Policy in The Netherlands: Content, Implementation and Outcomes. SVO, Den Haag, The Netherlands
Van Velzen W G, Miles M B, Ekholm M, Hameyer U, Robin D 1985 Making School Improvement Work: A Conceptual Guide to Practice. ACCO, Leuven, Belgium
B. P. M. Creemers
School Learning for Transfer

Most of what is taught in school is expected to affect learning and performance in ways that transcend the mastery of those subjects: one learns addition to prepare the ground for the study of multiplication, science to understand the surrounding physical and natural world, and history to gain a sense of identity and a deeper understanding of current events. Such expectations concern the transfer of what is learned in school in one domain to other realms of learning and activity, a spillover of learning that functions like a ripple effect. Thus, the study of topic A is supposed to facilitate the study of B, making it easier, faster, or better understood relative to when B is learned without the learning of A preceding it. Two approaches to the study of transfer have dominated the field since the beginning of the twentieth century: transfer as a function of similar elements between A and B, and transfer as a function of the understanding of general principles that are transferable to a variety of situations. The history of research on transfer shows how these two major principles underlie the development of the field, even when other basic assumptions about learning and thinking have recently been challenged.
1. Two Basic Approaches The expectation for transfer has as long a history as institutional learning itself. Plato argued that the study of abstract reasoning assists the solution of daily problems. Similarly, the debates about Talmudic and Biblical texts in ancient times were argued to ‘sharpen minds.’ The scientific study of transfer has a long history as well; it dates back to the beginning of the twentieth century with the studies by Thorndike and Woodworth (1901) in which the idea that Latin and other taxing school subjects developed one’s general ‘faculties of mind’ was challenged. The expected transfer was not found; mastery of Latin, Greek, or geometry did not facilitate either performance or efficiency in learning other subjects. Thorndike’s
findings led to his formulation of the theory of 'identical elements': transfer from A to B is likely to occur the more elements are common to both. It followed that transfer can take place only in the relatively rare case when there is clear and apparent similarity between the constituent elements of the learned topics. Thus, one would have to learn a host of independent issues and procedures, as no transfer should be expected in the absence of identical elements.
Thorndike's theory was countered by a more cognitively oriented and Gestalt-flavored alternative that emphasized the meaningful conceptual understanding of underlying principles, which then could be thoughtfully applied to new situations. The more general the learned principle, the farther it could be transferred to new instances. Judd (1908) had his subjects learn to shoot at targets submerged in water after having learned principles of light refraction in water. Having learned these principles, subjects were much better at hitting the targets than those who had only practiced the activity. As we shall see below, the two approaches, transfer as a matter of common elements and transfer as a matter of higher order cognitive processes of understanding and abstraction, continue to this day to dominate the field, although both have undergone profound modifications and developments over the years.
2. Paucity of Findings

Research on transfer, as well as practitioners' expectations for transfer, is characterized by an ongoing pendulum-like oscillation between the two approaches. Such oscillation is not so much the result of evidence that supports both approaches as it is a function of the paucity of findings clearly supporting either one of them. Indeed, one of the hallmarks of the field is the great discrepancy between the expectation for transfer from school learning to the acquisition of new subjects or to the handling of daily events, and the findings from controlled transfer studies. The latter often do not yield the transfer findings one would have expected. A typical case is the study by Pea and Kurland (1984), who found that children learning LOGO programming showed no advantage in their ability to plan over children who did not master LOGO. Transfer, when found at all in carefully carried out studies, appears to be highly specific, as when a particular procedure or principle is applied in a new situation which bears great and apparent (often analogical) similarity to the learning material, and when the process is accompanied by such facilitation as coaching, specific guidance, and cueing (see Bransford and Schwartz 1999, Cox 1997, for detailed reviews). Such findings would seem to support Thorndike's theory and pessimism about transfer, as summarized by Detterman (1993): 'In short, from
studies that claim to show transfer and that don't show transfer, there is no evidence to contradict Thorndike's general conclusions: Transfer is rare, and its likelihood of occurrence is directly related to the similarity between the two situations' (p. 15).
3. Transfer on the Rebound

Whereas Thorndike's pessimism seemed to have won the day for a while, the research tradition of Judd (1908), with its emphasis on comprehension and general (nonspecific) transfer, gradually showed its strength. As research in this tradition tended to show, when relevant principles or strategies are mindfully attended to, or better yet, either abstracted by learners and/or metacognitively monitored, transfer to a variety of instances can be obtained even in the absence of apparent common elements. At the same time, other research on the transfer of skills, ostensibly still in a tradition more similar to that originated by Thorndike (Thorndike and Woodworth 1901), has also yielded positive results. Thus, for example, Singley and Anderson (1989) have shown that when training to near automaticity of discrete skill components is carried out, transfer from one task (e.g., a line editor) to another (a text editor) can be virtually perfect. More field-based research has also found support for impressive transfer from the study of university-taught disciplines: Nisbett et al. (1993) have shown that the study of psychology, and to a lesser extent medicine (but not law or chemistry), improves students' abilities to apply statistical and methodological rules of inference to both scientific and daily problems.
4. The High and the Low Roads to Transfer

The renewed success of the two lines of research, often addressing higher order cognitions and information processing exercised by multiple activities, suggests that transfer may not be a unitary process, as the two approaches differ in important ways. Observing these differences, Salomon and Perkins (1989) developed a theory to account for the possibility that transfer takes either one of two routes (or a combination thereof), described as the high road and the low road of transfer. The low road reflects to an extent the Thorndikian line, and more recently that of Anderson's skill-acquisition and transfer theory. It is taken when skills, behaviors, or action tendencies are repeatedly practiced in a variety of situations until they are mastered to near-automaticity and are quite effortlessly applied to situations whose resemblance to the learning situations is apparent and easily perceived. Learning to drive and learning to read are two cases in point, as is transfer from one text editor to another as studied by Singley and Anderson (1989). Other candidates for low road transfer are attitudes,
cognitive styles, dispositions, and belief systems, the application of which to new situations is rarely a mindful process.
In contrast, the high road to transfer is characterized by the process of mindful abstraction of knowledge elements that afford logical abstraction: principles, rules, concepts, procedures, and the like. It is this mindfully abstracted, decontextualized idea ('ethnic oppression may lead to revolt') that becomes a candidate for transfer from one particular instance (the Austro-Hungarian Empire) to another (Russia and Chechnya). Simon (1980) commented in this respect that 'To secure substantial transfer of skill acquired in the environment of one task, learners need to be made explicitly aware of these skills, abstracted from their specific task content' (p. 82, emphasis added).
There is ample evidence to support the role of the high road in obtaining transfer. One of the studies by Gick and Holyoak (1983) illustrates this point. Participants who were given two stories and were asked to write a summary of how the two stories resembled each other (that is, their common moral) showed 91 percent transfer to the solution of another story, relative to 30 percent transfer for a non-summary group. Bassok (1990) showed that mastering algebraic abstractions (plus examples) allowed students to view physics problems as particular instances to which the more abstract algebraic operations could be applied. Physics, on the other hand, is too particular, and thus students do not expect and do not recognize any possible relationship between it and algebraic operations. Research also directs attention to the role played by self-regulation and metacognitions in the process of mindful abstraction and transfer via the high road. Following the studies and review of Campione et al. (1982), Salomon et al. (1989) showed that students interacting with a semi-intelligent computerized Reading Partner that provides metacognitive-like guidance tend to internalize that guidance and transfer it to new reading as well as writing situations.
The high road/low road theory sheds light on the many failures to obtain transfer in controlled studies. Close examination of such studies suggests that in many cases neither the low road of repeated practice nor the high road of mindful abstraction was taken. Not enough time is allocated to practice for the former, and not enough attention is given to mindful abstraction for the latter. As a consequence, neither near-automatic transfer on the basis of easily recognized common elements nor farther transfer on the basis of metacognitively guided mindful abstraction can be attained.
5. New Approaches Research on transfer, until recently, did not challenge the basic paradigm and conception of transfer as a 13578
process involving change in one's performance on a new task as a result of his or her prior performance on a preceding and different task. Such a conception of transfer has recently been challenged on the basis of new theories of learning, the role of activity, and the place of cognitions therein. These challenges can be arranged along a dimension ranging from the least to the most radical, relative to traditional notions of transfer.
Bransford and Schwartz (1999), coming from a mainstream tradition of cognitive science applied to instructional issues, argue that the current ways of demonstrating transfer are appropriate only for studying full-blown expertise. According to them, 'There are no opportunities for … [students] to demonstrate their abilities to learn to solve new problems by seeking help from other resources such as text or colleagues or by trying things out, receiving feedback, and getting opportunities to revise' (p. 68). They recommend replacing the typical transfer task, which they call 'sequestered problem solving' (SPS), with a conception of transfer as 'preparation for future learning' (PFL). Thus, one would measure transfer not by having students apply previously acquired knowledge or skill to new situations, demonstrating either knowledge how or knowledge that something, but rather by demonstrating knowledge with: thinking, perceiving, and judging with whatever knowledge and tools are available, even if that knowledge is not consciously recalled. In other words, transfer would be expected to become demonstrated by way of 'people's abilities to learn new information and relate their learning to previous experiences' (Bransford and Schwartz 1999, p. 69).
An illustration of the above can be seen in a study in which college and fifth-grade students did not differ in their utilization of previous knowledge and educational experiences for the creation of a statewide plan to recover the bald eagle. This is a typical SPS way of measuring transfer, suggesting in this case that previous educational experiences did not transfer to the handling of the new problem. However, when the students were asked to generate questions that would need to be studied in preparation for the creation of a recovery plan (applying a PFL approach), striking differences were found between the two age groups in favor of the college students. Focusing on different issues, the two groups showed how, knowingly or not, previous learning had facilitated the generation of the list of questions to be asked. The use of tools and information resources would also show how previous learning of skills and knowledge becomes actively involved in the process of solving a new problem.
While Bransford and Schwartz (1999) continue to adhere to the traditional conception of learning as knowledge acquisition and to cognitions as general and transferable tools of the mind, others have developed an alternative, socio-cultural approach according to which neither the mind and its content
(e.g., arithmetic algorithms) should be treated as a neutral toolbox ready for application wherever suitable, nor should knowledge-in-use and the social context of activity be taken as two independent entities. Learning is a matter of participation in a community of learners; thus knowledge is not a noun denoting possession but rather a verb denoting the active process of knowing-as-construction within a social context of activity. One underlying assumption is that learner, activity, and content become united through culturally constructed tools. A second assumption is that learning is highly situated in a particular activity context of participation, and its outcomes are 'becoming better attuned to constraints and affordances of activity systems so that the learner's contribution to the interaction is more successful' (Greeno 1997, p. 12). Seen in this light, the traditional cognitivist concern with the transferability of acquired knowledge across tasks becomes, from the newer situative perspective, an issue concerned with the task and the 'consistency or inconsistency of patterns of participatory processes across situations' (Greeno 1997, p. 12). An important practical implication of this approach is that one would need to take into consideration the kinds of participatory activities to which school-based learning might be expected to transfer, rather than expect decontextualized school materials to transfer all on their own to other tasks, situations, and activities. Students are likely to restructure new situations to fit their previous practices and thus make the old available for transfer to the new.

Illustrating this approach to learning and cognition, Saxe (1991) showed that Brazilian children's mathematical experience as street vendors affected their math learning when they later enrolled in school, and vice versa. '[I]n recurring activities like practices of candy selling and schooling—contexts in which knowledge is well learned and in which similar problems emerge on a repeated basis—transfer is a protracted process. In their repeated efforts to solve practice-linked problems, individuals attempt to structure problem contexts in coherent and more adequate ways' (p. 179).

The new approaches to transfer clearly deviate from the traditional views. They suggest new ways of measuring transfer and, more radically, challenge traditional assumptions by treating knowledge, skill, task, activity, and context as one participatory activity. By so doing, they emphasize the uniqueness of each activity setting. An important aspect of this approach is its ability to incorporate motivational and dispositional elements within the idea of participatory activity as well. These elements have so far been neglected by much research on transfer, although they may be crucial for students' choice to treat each situation as different, or as allowing them to exercise familiar patterns of participation.

Still, it appears that despite these important novelties, the principles of transfer as a function of similarity between situations (or activities) and of depth of
understanding general principles (e.g., of useful participation in a community of learners) have not yet been replaced but only redefined.

See also: Competencies and Key Competencies: Educational Perspective; Learning Theories and Educational Paradigms; Learning to Learn; Metacognitive Development: Educational Implications; School Effectiveness Research; Situated Learning: Out of School and in the Classroom; Thorndike, Edward Lee (1874–1949); Transfer of Learning, Cognitive Psychology of
Bibliography

Bassok M 1990 Transfer to domain-specific problem solving procedures. Journal of Experimental Psychology: Learning, Memory and Cognition 6: 522–33
Bransford J, Schwartz D 1999 Rethinking transfer: A simple proposal with multiple implications. Review of Educational Research 24: 61–100
Campione J C, Brown A L, Ferrara R A 1982 Mental retardation and intelligence: Contributions from research with retarded children. Intelligence 2: 279–304
Cox B D 1997 A rediscovery of the active learner in adaptive contexts: A developmental-historical analysis of transfer of training. Educational Psychologist 32: 41–55
Detterman D K 1993 The case for the prosecution: Transfer as an epiphenomenon. In: Detterman D K, Sternberg R J (eds.) Transfer on Trial: Intelligence, Cognition and Instruction. Ablex, Norwood, NJ
Gick M L, Holyoak K J 1983 Schema induction and analogical transfer. Cognitive Psychology 15: 1–38
Greeno J G 1997 Response: On claims that answer the wrong questions. Educational Researcher 26: 5–17
Judd C H 1908 The relation of special training to general intelligence. Educational Review 36: 28–42
Nisbett R E, Fong G T, Lehman D R, Cheng P W 1993 Teaching reasoning. In: Nisbett R E (ed.) Rules for Reasoning. Erlbaum, Hillsdale, NJ
Pea R D, Kurland D M 1984 On the cognitive effects of learning computer programming. New Ideas in Psychology 2: 37–68
Salomon G, Globerson T, Guterman E 1989 The computer as a zone of proximal development: Internalizing reading-related metacognitions from a Reading Partner. Journal of Educational Psychology 81: 620–7
Salomon G, Perkins D N 1989 Rocky roads to transfer: Rethinking mechanisms of a neglected phenomenon. Educational Psychologist 24: 113–42
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Simon H A 1980 Problem solving and education. In: Tuma D T, Reif F (eds.) Problem Solving and Education: Issues in Teaching and Research. Erlbaum, Hillsdale, NJ
Singley M K, Anderson J R 1989 The Transfer of Cognitive Skill. Harvard University Press, Cambridge, MA
Thorndike E L, Woodworth R S 1901 The influence of improvement in one mental function upon the efficacy of other functions. Psychological Review 8: 247–61
G. Salomon
School Management
School management is in considerable disarray, if not turmoil, and is likely to remain so in the first decade of the twenty-first century. This is partly a reflection of sweeping transformations in society, mirrored in developments in education generally and in schools in particular. It also reflects a failure to effect a powerful link between school management as a field of study and school management as a field of practice, so that each informs and influences the other. More fundamental is the concern that, in each instance, management is not connected to learning as well as it should be.
1. Definition of School Management

Disarray is evident in the matter of definition. In some nations, notably the USA, administration is used to describe a range of functions performed by the most senior people in an organization, with management a more limited, sometimes routine set of tasks, often performed by those in supporting roles. A conceptual difficulty relates to the distinction between leadership and management. The work of Kotter (1990) is helpful in resolving the issue. Leadership is concerned with change and entails the three broad processes of establishing direction, aligning people, and motivating and inspiring. Management is concerned with the achievement of outcomes, and entails the three broad processes of planning and budgeting, organizing and staffing, and controlling and problem-solving.

These difficulties are reflected in the role ambiguity of those who hold senior positions in schools, notably principals or head teachers. There is a generally held view that their work entails both leadership and management. They should certainly be leaders. In some instances managers will support them. There is concern when they act solely as managers. Kotter (1990) offers a contingent view that is as helpful for schools as it is for organizations in general. He contends that the emphasis and balance in leadership and management are contingent on two variables: the amount of change that is required and the complexity of the operation. Complex schools in a turbulent environment require high levels of leadership and management. Schools that are relatively simple but are faced with major change require leadership more than management. Schools that are complex in stable circumstances require management more than leadership. Simple schools in stable settings may require little of each.
2. School Management as a Field of Practice

There are three important developments in school management that are evident in almost every nation
(Caldwell and Spinks 1998). Schools and systems of schools vary in the extent to which each has unfolded.

The first is the shift of authority and responsibility to schools in systems of public education. Centrally determined frameworks of curriculum, standards, and accountabilities are still in force, but schools have a considerable degree of freedom in how they will meet expectations. This development is known as school-based management, local management, or self-management, and is frequently accompanied by the creation of school councils, school boards, or other structures for school-based decision-making. Comprehensive reform along these lines is most evident in Australia, the UK, Canada, New Zealand, and the USA, but most other nations have begun the process or are planning it. Reasons for decentralization vary but are generally consistent with the changing view of the role of government that gathered momentum in the final decade of the twentieth century. As applied to education, this view holds that governments should be mainly concerned with setting direction, providing resources, and holding schools to account for outcomes. They should also provide schools with the authority and responsibility to determine the particular ways they will meet the needs of their students, within a centrally determined framework.

The second development arises from higher expectations for schools as governments realize that the knowledge and skill of their people are now the chief resource if nations are to succeed in a global economy. There is recognition of the high social cost of failure and the high economic cost of providing a world-class system of education. As a result, there is unprecedented concern for outcomes, and governments have worked individually and in concert to introduce systems of testing at various stages of primary and secondary schooling to monitor outcomes and provide a basis for target setting in bringing about improvement. In several countries, notably the UK and some parts of the USA, rankings of schools are published in newspapers. Comparisons of national performance are now possible through projects such as the Third International Mathematics and Science Study (Martin and Kelly 1996).

The third development is change in the nature of schooling itself. Driven to a large extent by advances in information and communications technology, much of the learning that in the past could only occur in the classroom can now occur at any time and anywhere there is access to a networked computer. Students and their teachers have access to a vast amount of information. Interactive multi-media have enriched and individualized learning on a remarkable scale. This development is not uniform across all schools or across all nations, and disparities are a matter of concern. There is growing anxiety on the part of government that structures for the delivery of public education that
have survived for a century or more may no longer be adequate for the task. Charter schools made their appearance in Canada and the USA in the 1990s, and their number increased very rapidly, even though they are still a small fraction of the total number of schools. Charter schools receive funding from the public purse but are otherwise independent, operating outside the constraints of a school district. A schools-for-profit movement gathered momentum around the same time, with public funding enhanced by private investment through the offering of shares on the stock exchange, as instanced by the pioneering Edison Schools. In the UK, the Blair Government privatized the support services of several local education authorities and established a framework for public–private partnerships in the management of publicly owned schools. The number of private schools is increasing in most countries, reflecting growing affluence and loss of faith in the public sector. A contentious policy, gaining ground in a number of countries, is the provision of public funds (often described as 'vouchers') allowing parents to access private schools when public schools do not meet expectations.

Taken together, these developments have created conditions that call for high levels of leadership and management at the school level. Schools are more complex and are experiencing greater change than ever before. A particular challenge is to prepare, select, place, appraise, and reward those who serve, and there is growing concern in some countries about the quality and quantity of those who seek appointment. Some governments have created or are planning new institutions to address this concern, illustrated by the establishment in England in 2000 of the National College for School Leadership. There is evidence of a loss of faith in the university as an institution to shape and inform approaches to school management, and to prepare those who will serve as leaders and managers in the future.
3. School Management as a Field of Study

School management has emerged relatively recently as a sub-field of study within the broader field of educational management (administration). There were few formal programs and little research on the phenomenon until the 1950s. Prior to that, approaches to the management of schools paralleled those in industry. Indeed, it was generally recognized that management in education was largely a mirror image of industry, with the creation of large centralized and bureaucratized systems of public education. The knowledge base was drawn from the industrial sector. Departments of educational administration made their appearance in universities in the US in the mid-twentieth century and soon proliferated. They were
established in the UK soon after, with many other nations following suit, drawing on the US and, to a lesser extent, the UK for their intellectual underpinnings. The first quarter-century, from the early 1950s to about the mid-1970s, was characterized by efforts to build theory in educational management, modeling to a large degree the discipline approach of the behavioral sciences. In the USA, Educational Administration Quarterly was an attempt to provide for education what Administrative Science Quarterly has done for administration (management). In the UK, a seminal work edited by two pioneers in the field had the title Educational Administration and the Social Sciences (Baron and Taylor 1969). Separate sub-disciplines were created in fields such as economics, finance, governance, human relations, industrial relations, law, planning, policy, and politics. Large numbers of students enrolled, and departments of educational administration became a major source of revenue for schools of education.

The theory movement in educational management was the subject of powerful attack in the mid-1970s, notably by Canadian scholar T. B. Greenfield (1975). However, it was not until the late 1980s that a systematic response was mounted by the University Council for Educational Administration in the 1987 report of the National Commission on Excellence in Educational Administration (NCEE 1987). The critique was largely directed at efforts to build a scientific theory of educational management; the separation of management from learning and teaching; and the neglect of fundamental considerations of context, culture, ethics, meaning, and values. Blueprints for reform were drawn up but the field was still in ferment at the turn of the twenty-first century (see Murphy and Forsyth (1999) for a review of efforts to reform the field). Evers and Lakomski have developed a new conceptual framework ('naturalistic coherentism') in an effort to bring coherence to the fields of educational administration and management. Lakomski (2001) argues that leadership ought to be replaced by better conceptions of organizational learning.

On a more pragmatic level, the critique of educational management as a field of study lies in its failure to impact practice on a large scale, and to assist in a timely manner the implementation of reform and the resolution of concerns. A promising approach to unifying the field is to place learning at the heart of the effort. This means that policy interest in the improvement of student outcomes should be the driver of this 'quest for a center,' as US scholar Joseph Murphy (1999) has described it. Murphy classified the different stages of development in the 'profession' of educational administration, as summarized in Table 1. Much momentum was gained in the 1990s with the establishment of the International Congress for School Effectiveness and Improvement (ICSEI), which made its mission the bringing together of policy-makers, practitioners, and researchers.
Table 1 Rethinking the center for the profession of educational administration (Murphy 1999)

Time frame                      Center of gravity     Foundation                 Engine
1820–1900 (Ideology)            Philosophy            Religion                   Values
1901–1945 (Prescription)        Management            Administrative functions   Practical knowledge
1946–1985 (Behavioral science)  Social sciences       Academic disciplines       Theoretical knowledge
1986– (Dialectic)               School improvement    Education                  Applied knowledge
The periodical School Effectiveness and School Improvement quickly established itself as a leading international journal. The integration of policy, practice, and research to achieve school improvement is becoming increasingly evident, illustrated in approaches to literacy in the early elementary (primary) years. The examples of Australia and the UK are interesting in this regard. Levels of literacy were considered by governments in both nations to be unacceptably low. Implementation of approaches in the Early Literacy Research Project led to dramatic improvement in the state of Victoria, Australia. Concepts such as 'whole school design' emerged from this integration. Hill and Crévola (1999) formulated a 'general design for improving learning outcomes' that proved helpful in parts of Australia and the United States. It has nine elements: beliefs and understanding; leadership and coordination; standards and targets; monitoring and assessment; classroom teaching strategies; professional learning teams; school and classroom organization; intervention and special assistance; and home, school, and community partnerships. The implications are clear as far as educational management is concerned: there must be high levels of knowledge about learning and teaching and about how to create designs and formulate strategies that bring about improvement. This contributes to a particularly demanding role where decentralization has occurred, as is increasingly the case in most nations.
4. The Future of School Management

The picture that emerged at the dawn of the twenty-first century is one in which school management does not stand in isolation. Leadership and management, as conceptualized by Kotter (1990), are both important, and their focus must be on learning, integrated in the notion of creating a coherent and comprehensive whole-school design. Those who serve in positions of leadership are operating in a milieu of high community expectations in which knowledge and skill are the chief determinants of a nation's success in a global economy. Rapidly increasing costs of achieving
expectations mean a capacity to work in new arrangements, including innovative public–private partnerships. High expectations mean a commitment to core values such as access, equity, and choice. All of this unfolds in an environment in which the nature of schooling is undergoing fundamental change.

There are major implications for universities and other providers of programs for the preparation and professional development of school leaders. An interesting development in Victoria (Australia) and the UK is that governments are turning to the private sector to specify requirements and offer a major part of training programs. Governments in both places commissioned the Hay Group to assist in this regard. The target population in the UK is serving principals, numbering about 25,000. The program has been very highly rated by both principals and employers. The most recent contribution of the Hay Group has been the specification for Victoria of 13 competencies and capabilities for school leaders. The focus is school improvement and there are four components: (a) driving school improvement (passion for teaching and learning, taking initiative, achieving focus); (b) delivering through people (leading the school community, holding people accountable, supporting others, maximizing school capability); (c) building commitment (contextual know-how, management of self, influencing others); and (d) creating an educational vision (analytic thinking, big picture thinking, gathering information).

These developments suggest that training will be a shared responsibility in the future, with a limited role for universities unless they can form effective partnerships with the profession itself and work in association with private providers. Universities will have a crucial role to play, especially in research, but again that is likely to be in a range of strategic alliances if it is to be valued. The role of the university in philosophy and critique of school management will be more highly valued to the extent that the disjunction is minimized (between study and practice and between management and learning).

A fundamental issue that awaits resolution is the extent to which those who work in school management can avoid being weighed down by the high expectations. It was noted at the outset that applications for appointment are declining in quality and quantity. There is even a
concern in well-funded schools in the private sector. Part of the solution is to infuse leadership and management throughout the school and generally build the capacity of all to work in the new environment. The concept of 'knowledge management' is likely to become more important, so that accounting for the resources of a school will extend beyond financial and social capital to include intellectual capital in the form of the knowledge and skill of staff. This may go only part of the way, given that schools are in many respects still operating with a design for an earlier era. Innovation in design must extend to the creation of a workplace that is engaging for all who work within it. What should be retained and what should be abandoned in current designs is the challenge for the first decade of the twenty-first century. Drucker's concept of 'organized abandonment' (Drucker 1999) is as important for schools as it is for other organizations.

See also: Educational Innovation, Management of; Management: General
Bibliography

Baron G, Taylor W (eds.) 1969 Educational Administration and the Social Sciences. Athlone, London
Caldwell B J, Spinks J M 1998 Beyond the Self-managing School. Falmer, London
Drucker P F 1999 Management Challenges for the 21st Century. Butterworth Heinemann, Oxford
Greenfield T B 1975 Theory about organizations: A new perspective and implications for schools. In: Hughes M G (ed.) Administering Education: International Challenge. Athlone, London, pp. 71–99
Hill P W, Crévola C A 1999 The role of standards in educational reform for the 21st century. In: Marsh D D (ed.) Preparing our Schools for the 21st Century. ASCD Yearbook 1999. Association for Supervision and Curriculum Development, Alexandria, VA
Kotter J P 1990 A Force for Change: How Leadership Differs from Management. Free Press, New York
Lakomski G 2001 Management Without Leadership: A Naturalistic Approach Towards Improving Organizational Practice. Pergamon/Elsevier, Oxford, UK
Martin M O, Kelly D L (eds.) 1996 Third International Mathematics and Science Study. Technical Report. Vol. 1: Design and Development. Boston College, Chestnut Hill, MA
Murphy J 1999 The quest for a center: Notes on the state of the profession of educational leadership. Invited paper at the Annual Meeting of the American Educational Research Association, Montreal, April
Murphy J, Forsyth P (eds.) 1999 Educational Administration: A Decade of Reform. Corwin Press, Newbury Park, CA
National Commission on Excellence in Educational Administration 1987 Leaders for America's Schools. University Council for Educational Administration, Tempe, AZ
B. J. Caldwell
School Outcomes: Cognitive Function, Achievements, Social Skills, and Values

Achievement in US schools is as high as it was in the mid-1970s, despite the fact that increasingly more poor and minority students are remaining in schools longer than ever. Unlike in the early 1960s, when researchers debated whether there was any impact of schooling on students' learning, it is now accepted that schools (and especially teachers) impact student achievement. Research on classroom learning and school achievement, broadly defined, continues to grow in important ways (see, for example, Biddle et al. 1997). However, much of the recent scholarship on classroom processes has been conceptual in nature, and the associated empirical research (if any) typically involves only a few classrooms. There is considerable advocacy for certain types of classroom environments—classroom environments that are notably different from traditional instruction—but solid empirical support to show the effects of new instructional models is largely absent. Here we discuss the need for new school-based research that addresses curriculum and instructional issues in order to advance theoretical understanding of student learning in school settings. We argue that it is no longer possible to discuss 'successful schooling' on the basis of subject-matter outcomes alone. Effective schools research and policy discussions must also include the measurement of non-subject-matter outcomes of schooling.
1. Status of Achievement in Schools

The state of achievement in US schools is widely debated and incorporates at least three major camps. The first contends that current school performance is lower than that of US youth from the mid-1970s and that of international contemporaries. Indeed, several publications supported by the federal government have corroborated the assertion that US schools are failing, including A Nation at Risk (National Commission for Excellence in Education 1983) and Prisoners of Time (National Education Commission on Time and Learning 1994).

A second camp argues that students' performance today is as good as it was in the mid-1970s and that international comparisons are uninterpretable (Berliner and Biddle 1995). These and other researchers note that the stability of achievement has been maintained at a time when US schools are increasingly diverse and serving more minority students and those from poor homes than ever before. This camp also: (a) notes that reports of average school achievement are often highly misleading, because students' performance varies markedly between schools; and (b) has concerns about the performance of minority students.
Importantly, there is documentable evidence not only of the stability of student achievement over time, but also of real progress. For example, in less than a decade, the number of students, including minority students, taking advanced placement tests in high schools for college credit doubled, such that one-fifth of US students in 1996 entered college with credits earned from testing (Good and Braden 2000).

A third camp agrees with the second position, but asserts that traditional achievement standards are no longer applicable to a changing society—schools must address different, higher subject-matter standards. Although we accept the premise that some curriculum changes are probably necessary to accommodate the knowledge needs of a changing society, we have two major problems with advocates for new, higher standards. First, there has been no specification of what these new skills are and, typically, if examples are even provided, the rhetoric of higher standards essentially translates into moving content normally taught later (college calculus) to appear sooner (in high school). This position is specious, because it lacks sound theoretical and empirical support. Second, extant data suggest that achievement of students in states with 'higher standards' is lower than that of students in states with purported lower standards (Camilli and Firestone 1999)!

We align our arguments with advocates of the second camp, and we note that there is extensive evidence to support this position—student achievement scores are stable in the face of growing student diversity (Berliner and Biddle 1995, Good and Braden 2000). However, despite strong empirical data to show that student performance has not declined, we believe there is much room for improvement. For example, schools in urban settings need to improve student subject-matter performance, and all schools need to further enhance student non-subject-matter achievement. The academic accomplishments of students in today's schools are difficult to understand for various reasons. Here, we review several reasons why the achievement of US students is problematic.
2. Which Test is Appropriate?

For one example of the difficulty of using extant assessment instruments to compare students within the USA or across cultures, one need only note the discrepancy between the measures. For example, as Bracey and Resnick (1998) report, in 1996 the National Assessment of Educational Progress (NAEP) mathematics performance indicated that only 18 percent of US fourth graders were proficient, only 2 percent were advanced, and 32 percent were below basic. In contrast, US fourth graders performed well above the average scores of the 26 countries that participated in the international TIMSS study at the fourth-grade
level. Which test is the best descriptor? Are US students above average in fourth-grade mathematics, or not? This issue of ‘which test’ is embedded in US testing, as well. For example, in New York state there has been a raging debate about the appropriateness of a state-mandated test (i.e., in relation to other measures, such as advanced placement tests).
3. Problems with Curriculum, Theory, and Test Alignment

Other factors also limit efforts to relate the implemented curriculum with student achievement. For example, in mathematics, a current 'theoretical' recommendation is to avoid instruction that is a mile wide and an inch deep (i.e., in terms of the breadth and depth of curriculum concepts covered). Those who advise teachers about this problem apparently do not worry about the opposite problem—creating a curriculum that is four inches wide and a mile deep!

Aligning standardized achievement tests with curriculum implementation is vital for meaningful interpretation of student scores. Appropriately, achievement tests should assess the philosophical orientation of the curriculum. Then, other tests and research can be used to assess the utility of a curriculum for various purposes (i.e., do various curricula have different effects on long-term memory of the curriculum taught or on students' ability to transfer ideas learned in a curriculum to solve new problems?).

Finally, curriculum writers often do not understand the learning theory that undergirds particular instructional practices or goals. Some curriculum theorists do not appear to understand that behavioral models are powerful when dealing with factual and conceptual knowledge (and constructivist models are exceedingly weak in this area). Also, they often fail to understand that information-processing models (cognitive science) are more powerful than behavioral or constructivist principles at addressing wide horizontal transfer of academic concepts and authentic problem solving. In addition, in applying constructivist principles, many curriculum experts do not even recognize the differences among constructivist theories. Despite the difficulties associated with assessment instruments, curricula, and instructional alignment with achievement outcomes, there are data to suggest that teachers and schools can make a difference in student achievement.
4. Effective Teachers

The question that was problematic in the mid-1970s—'Do teachers make a difference in student learning?'—can now be answered with a definite 'yes' (Weinert and Helmke 1995). There are clear data to
illustrate that some teachers notably outperform other teachers in helping similar students to learn the type of material historically measured on standardized achievement tests. It is not only possible to identify effective teachers on this useful but limited measure of student learning (e.g., see Good et al. 1983), but research has also illustrated that the practices of 'successful' teachers could be taught to other teachers. As one instance of this research base, the Missouri Mathematics Program was successful in improving students' performance in mathematics. However, a series of research studies in this program found that teachers who were involved in the training program implemented the intended program unevenly. Also, not surprisingly, higher levels of program implementation were associated with higher levels of student mathematical performance. Importantly, some of the 'resistance' to implementing the Missouri Mathematics Program was due to teachers' beliefs about mathematics and how it should be taught (Good et al. 1983). Teachers' beliefs are a key, but often overlooked, point in school reform. And, as Randi and Corno (1997) found, failure to take teachers' ideas into account lowers the impact of many reform interventions.

Many aspects of teaching and learning are still problematic. Although there are numerous plausible case examples of teachers who impact students' ability to think and to apply knowledge, there is still debate about how teachers obtain these 'higher order' outcomes and whether other teachers can be educated in ways that allow them to achieve comparable effects on their students. Unfortunately, research on teachers' roles in helping students to develop thinking and problem-solving skills has in recent times been limited to case-study research. Further, research that helps teachers to expand—and assess—their capacity for impacting students' thinking abilities has been inert.
5. Effective Schools

There is some evidence that some schools (serving similar populations of students) have more effect on student achievement (as measured by conventional standardized achievement tests) than do other schools (e.g., Good and Weinstein 1986). However, unlike the correlational and experimental research on teacher effects, the early research base describing 'effective schools' was sparse and questionable. Teddlie and Stringfield's (1993) longitudinal study in Louisiana has provided strong evidence that schools make a difference. They found that schools serving similar populations of students had different 'climates' and that these school differences were associated with differential student achievement. In general, the study confirmed many of the arguments that school effects researchers had made previously. Interestingly, in their
eight-year study, about one-half of the schools retained their relative effectiveness during the time period (stability rates were similar for schools that had initially been defined as effective or ineffective). More work on this issue is warranted. There still have not been successful and replicated studies showing that factors associated with 'effective' schools can be implemented in other schools in ways that enhance student achievement. Although there are notable examples showing that schools can be transformed, there is comparatively little research on whether this knowledge can be used to improve other schools.
6. Comprehensive School Programs

In recent years, interventions combining components of previous teacher and school effects research with emerging ideas (e.g., cooperative student groups) have appeared as 'comprehensive school programs' designed to transform all aspects of schooling at the same time (including governance, structure, instruction, home–school communication, curriculum, and evaluation!). Although these broad interventions offer potential, it should be noted that the theoretical assumptions and congruence of these programs have not been assessed. Some programs appear to have a serious misalignment between program components. For example, as McCaslin and Good (1992) noted, some schools emphasize a behavioral approach to classroom management ('do as you are told') and a constructivist approach to the curriculum ('think and argue your conception').

Recent school intervention efforts have focused upon the impact of broad programs on student achievement. Typically, research includes no observational data, and thus there is no evidence of what parts of the program are implemented and/or whether the program parts that are presented represent a theoretically integrated program or a 'patchwork quilt' of separate and conflicting program parts. Further, the more complex and comprehensive the intervention model, the more likely teachers are to alter program parts (often permanently). These 'teachable' moments go unnoticed by reformers. Perhaps unsurprisingly, there is growing consensus that such programs have not had consistent, demonstrable effects on students' achievement (Slavin 1999).
7. Little Support for New Research

In the past decade, policy leaders have tended to agree with advocates of camp one about the effectiveness of schools. Those in camp one (who believe that schools are failing radically) have been able to convince policymakers to invest widely in charter and voucher plans, even in the absence of any convincing
pilot data. To these advocates, both the problem and the solution are evident.

Those in the second camp (stable student achievement over time) have been 'represented' by researchers attempting to implement comprehensive school reform. Unfortunately, as noted above, such research attempts fail to show the effects of program implementation. This group of researchers does not question the knowledge base sufficiently and hence fails to develop research designs that could improve its conceptions of 'good practice.'

Those who fall into the third camp (new societal requirements mandating new processes and outcomes of schooling) are prepared to reject extant theory and research on the basis that it is outdated and irrelevant. Indeed, it is not only that monolithic state-mandated standardized tests have gotten in the way of scientific research, but also that there is a growing willingness of professional groups to advocate desirable teaching processes on theoretical grounds alone. Prestigious groups, such as the National Council of Teachers of Mathematics (NCTM), have encouraged curriculum development without advocacy for research on instruction and have encouraged teachers to teach less and to use more student group work. Although it is perhaps the case that, at an aggregated level, teachers as a group should teach less, the real issue is when teachers should teach and when students should explore alone or with other students in small or large groups.
8. Controllable and Uncontrollable Influences on Student Learning

It has become better understood in the last 25 years of the twentieth century that schools can influence students' learning independent of home background. However, it is also better understood that the effects of schools are mediated by other factors. Bracey and Resnick (1998) argue that many factors impacting students are within the control of the school district, including: aligning textbooks with standards; aligning the broader curriculum with standards; employing high-quality teachers and providing them with appropriate professional development; and having modern textbooks and laboratory equipment. Although these factors are controllable in theory, it should be noted that some school districts have considerably fewer resources to work with than do other districts. It seems unconscionable that all schools do not receive these controllable resources, which are known to impact achievement.

Bracey and Resnick (1998) have argued that there are various out-of-school factors that can influence student achievement, including: teenage pregnancy rates; percentage of female-headed households; poverty rate; paternal and maternal educational levels; the percentage of students for whom English is not their native language; number of violent incidents per
year; and number of annual police visits/disciplinary actions that occur at particular schools. These opportunity-to-learn variables are seen as conditions that impact student achievement but are largely beyond the direct influence of the school system. Accordingly, various groups have noted that achievement problems of students from low-income families need to be addressed by other social agencies. Clearly, poor achievement in schools is often outside the control of individual schools, but not outside the influence of the broader society. If a society chooses to hold individual schools 'accountable' for achievement, it is incumbent on that society to spread its wealth in ways that address factors that individual schools cannot address.
9. Improving Performance: Aligning Theory and Research

Massive investments of public funds in student testing have occurred over the last two decades of the twentieth century and, as noted, some policymakers have called for even more testing and for raising extant standards. The push for higher standards has led to the development of tests in some states that are designed, absurdly, so that most students fail them. Oddly, despite this massive investment in testing, no information has been obtained to describe adequately what students learn in schools and how conditions can be changed to further enhance student learning.

We believe that school reform efforts need to bring theory and empirical research together. It is time to stop advocacy from prestigious groups that do not support their reform calls with theory and data. There is a growing sense of the importance of instructional balance in order to scaffold the learning needs of students in complex classroom settings (Good and Brophy 2000, Grouws and Cebulla 2000). However, these assertions demand more research examination in various contexts. This advice is currently not echoed by professional groups, which continue to overemphasize (in our opinion) problem solving and application without concomitant attention to the development of conceptual knowledge. The ability to solve problems also requires the ability to recognize and find problems, as well as the computational and conceptual skills necessary for addressing problems. Simply put, there can be too little or too much attention to concepts, facts, or problem solving.
10. Non-subject-matter Outcomes of Schooling

Americans have always argued that their students should not simply be 'academic nerds,' but that students should also be participants in broader society (have jobs, participate in drama or sports, etc.).
Increasingly, parents and citizens have expressed interest in schools being more responsive to the non-subject-matter needs of youth. Often, this argument is expressed only in terms of improving achievement—if children feel secure, they will achieve better. But increasingly, the assertion is made that some outcomes of schooling are important whether or not they impact subject-matter learning. Despite the considerable advocacy of policy leaders for higher standards, many citizens are pushing hard for students to learn responsibility. For example, when asked what first comes to mind when they think about today's teenagers, 67 percent of Americans described them as 'rude, irresponsible, and wild,' and 41 percent reported that they did not think teens had enough to do and that they were not being taught the value of hard work (schools are not doing their 'job') (Farkas et al. 1997). This idea is echoed in the 1999 Phi Delta Kappan/Gallup poll, where 43 percent of parents reported that the main emphasis of schools should be on students' academic skills, and 47 percent of parents said they believe that the main emphasis of schools should be on helping students learn to take responsibility. And earlier, in a 1996 Phi Delta Kappan/Gallup poll, parents indicated they would prefer their children to receive bad grades and be active in school activities rather than make excellent grades and not participate in extracurricular activities (60 percent to 20 percent). It is clear that parents and citizens, reacting to a perception of adolescents' moral decline, believe that schools should play a greater role in teaching non-subject-matter content, such as responsibility and civic behavior.

Although citizens want schools to provide students with more non-subject-matter knowledge, there are problems with the systematic addition of this 'content' to the curriculum. One problematic issue is the difficulty of finding consensus on what a moral curriculum would look like. Further, what assessment criteria should be used to measure the curriculum's effectiveness? Is the curriculum successful if more students vote? Or is it effective if students are able to negotiate conflicts calmly? As we have argued, achievement tests do not always represent adequately the curricula they are supposed to measure. Therefore, it is reasonable to think that appropriate accountability and assessment tools would be difficult to implement for non-subject-matter content—at least without adequate empirical research.

Policymakers, however, are not only supporting past emphases of the impact of schools on achievement, but are pressuring for higher and higher standards. In some instances, the standards have been pushed so high (and artificially high) that in Virginia and in Arizona over 90 percent of students failed exams that presumably represented efforts by state government to improve standards. At some level, increasing standards may be a laudable goal; however, developing standards that only 10 percent of our
students pass seems to be a ridiculous and self-defeating effort. Given the exaggerated, negative views of youth, and testing policies designed to guarantee youth's failure, it seems important to explore motivation. Are such practices designed to suggest that youth, especially minority youth, do not deserve resources in their schools?
11. Educators on Noncognitive Factors

Rothstein (2000) has noted that, in part, achievement measures are reified in making decisions about the effectiveness of schooling because there are many of these measures, but few ways to measure noncognitive outcomes. He noted that there are 10 core outcomes that motivate citizens to invest in schools so that students can contribute to society and lead productive adult lives. The dimensions he identifies are literacy; competencies in mathematics, science, and technology; problem solving; foreign languages; history knowledge; responsible democratic citizenship; interest in creative arts; sound wellness practices; teamwork and social ethics; and equity (narrowing the gap between minority and white students in the other nine areas).

Wanlass (2000) has drawn attention to the numerous problems in contemporary society, including the effects of a global economy, increased ethnic diversity, advanced technology, and increasing risk for various problems such as intolerance (hate crimes), overpopulation, and a dramatic depletion of natural resources. In this context, she argued the need to help youth develop the affective dispositions required for the modern world (leadership, tolerance, philanthropic concern). She suggested several strategies, capitalizing on the unique strengths and talents of more students and the need to better develop students' unique abilities in nonacademic as well as academic areas (performing arts, social interaction, leadership, civic-mindedness, social advocacy, etc.).

Goodenow (1992) has argued that students who feel more connected to their schools are more likely to be academically and socially successful. Students who felt more personally valued and invested in were more likely to place higher value on classroom success, and to have higher expectations for it, than were students who did not feel valued. This finding was replicated in a large-scale study (Blum and Rinehart 1996), which surveyed more than 90,000 students nationwide and found that those who felt more connected to their schools were less likely to engage in risky behavior—including smoking, doing drugs, engaging in sexual activity, and repeated truancy—and they were less likely to drop out of school. In this study, school connectedness was defined in terms of students who: (a) felt they were treated fairly, (b) reported they got along with teachers and other students, and (c) felt close to people at school. Connectedness was also explored in
terms of how schools were perceived by their teachers and administrators. A school that encouraged more connectedness exhibited less student prejudice, had higher daily attendance rates and lower dropout rates, and employed more teachers with master's degrees. Although these data are correlational in nature, they suggest that some factors thought to be uncontrollable may in fact be controllable.

Researchers have also investigated the role of non-subject-matter variables as they influence students' motivation to achieve. For example, Nichols (1997) examined the nature of the relationship between students' perceptions of fairness and their reported levels of motivation to achieve. Of the four dimensions of fairness identified through factor analytic techniques (personal, rules, punishment, teacher), a personal sense of fairness more strongly mediated motivational variables than did the other dimensions. Similar to Goodenow (1992), this data set indicated that students who felt more personally valued were more likely to be motivated to achieve in school settings. However, even if students' personal sense of fairness were not linked directly to achievement motivation, it would seem that this is a desirable outcome of schooling. Students should feel respected and safe, whether or not these variables influence achievement.
12. Conclusion

Clearly, more public debate is called for if more focused judgments are to be reached concerning those aspects of schools that are defined as most critical. However, given the growing evidence that citizens are concerned about non-subject-matter outcomes of schooling, it seems incumbent on policymakers to 'proxy' these concerns by collecting data on some of the many outcomes that could be measured. There is a need for research and arguments focused on non-subject-matter variables as well. For example, it seems that high subject-matter specialization and excessively high levels of student obedience (as opposed to student initiative) are not a prescription for forging a professional community well prepared for engaging in creative work and problem solving.

A recent poll of teenagers found that the three greatest pressures they experience are to get good grades (44 percent), to get into college (32 percent), and to fit in socially. In contrast, fewer teens report feeling pressure to use drugs or alcohol (19 percent) or to be sexually active (13 percent). What modern youth are concerned about varies markedly from the view presented by the media. Unfortunately, policymakers at present seem more intent on blaming youth and holding them accountable than on understanding them. The collection of reliable data on various non-subject-matter outcomes of schooling might help to assure that youth are developing the
proactive skills necessary for life in a democracy (e.g., civility, a willingness to examine ideas critically, and ambition).

The historical concern for evidence about subject-matter growth should continue—particularly if these measures are collected in ways that provide a basis for curriculum improvement. However, it is important to recognize that US public schools are about much more than subject-matter acquisition. More research on students and how they mediate social, emotional, and academic contexts might enable educators to design better school programs that recognize students both as academic learners and as social beings (McCaslin and Good 1992).

See also: Curriculum as a Field of Educational Study; Educational Assessment: Major Developments; Educational Evaluation: Overview; Educational Research for Educational Practice; School Effectiveness Research; Teacher Behavior and Student Outcomes
Bibliography

Berliner D, Biddle B 1995 The Manufactured Crisis: Myth, Fraud and the Attack on America's Public Schools. Addison-Wesley, New York
Biddle B, Good T, Goodson I 1997 Teachers: The International Handbook of Teachers and Teaching. Kluwer, Dordrecht, The Netherlands, Vols. 1 and 2
Blum R W, Rinehart P M 1996 Reducing the Risk: Connections That Make a Difference in the Lives of Youth. University of Minnesota, Division of General Pediatrics and Adolescent Health, Minneapolis, MN
Bracey G W, Resnick M A 1998 Raising the Bar: A School Board Primer on Student Achievement. National School Boards Association, Alexandria, VA
Camilli G, Firestone W 1999 Values and state ratings: An examination of the state-by-state indicators in quality counts. Educational Measurement: Issues and Practice 35: 17–25
Farkas S, Johnson J, Duffett A, Bers A 1997 Kids These Days: What Americans Really Think About The Next Generation. Public Agenda, New York
Good T L, Braden J 2000 The Great School Debate: Choice, Vouchers, and Charters. Erlbaum, Mahwah, NJ
Good T L, Brophy J 2000 Looking in Classrooms, 8th edn. Longman, New York
Good T L, Weinstein R 1986 Schools make a difference: Evidence, criticisms, and new directions. American Psychologist 41: 1090–7
Good T L, Grouws D A, Ebmeier H 1983 Active Mathematics Teaching. Longman, New York
Goodenow C 1992 School motivation, engagement, and sense of belonging among urban adolescent students. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA
Grouws D, Cebulla K 2000 Elementary and middle school mathematics at the crossroads. In: Good T (ed.) American Education: Yesterday, Today, and Tomorrow. Ninety-Ninth Yearbook of the National Society for the Study of Education. University of Chicago Press, Chicago, Chapter 5, pp. 209–55
McCaslin M, Good T 1992 Compliant cognition: The misalliance of management and instructional goals and current school reform. Educational Researcher 21: 4–17
National Commission for Excellence in Education 1983 A Nation At Risk: The Imperatives for Education Reform. US Department of Education, National Commission for Excellence in Education, Washington, DC
National Education Commission on Time and Learning 1994 Prisoners of Time. US Government Printing Office, Washington, DC
Nichols S 1997 Students in the Classroom: Engagement and Perceptions of Fairness. Master's thesis, University of Arizona
Randi J, Corno L 1997 Teachers as innovators. In: Biddle B, Good T, Goodson I (eds.) International Handbook of Teachers and Teaching. Kluwer, Dordrecht, The Netherlands, Vol. 2, Chap. 12, pp. 1163–221
Rothstein R 2000 Toward a composite index of school performance. Elementary School Journal 100(5): 409–42
Slavin R 1999 How Title I can become the engine of reform in America's schools. In: Orfield G, Debray E (eds.) Hard Work for Good Schools: Facts Not Fads In Title I Reform. The Civil Rights Project, Harvard University, Cambridge, MA, Chap. 7, pp. 86–101
Teddlie C, Stringfield S 1993 Schools Make a Difference: Lessons Learned From a 10-Year Study of School Effects. Teachers College Press, New York
Wanlass Y 2000 Broadening the concept of learning and school competence. Elementary School Journal 100(5): 513–28
Weinert L, Helmke A 1995 Interclassroom differences in instructional quality and interindividual differences in cognitive development. Educational Psychologist 30: 15–20
T. L. Good and S. L. Nichols
Schooling: Impact on Cognitive and Motivational Development
Until the mid-1970s, neither the general public nor the social sciences could agree on whether schooling has a significant impact on cognitive development, how large this impact is or may be, and which variables are responsible for any potential effects it may have. For example, even in 1975, Good et al. asked skeptically, 'Do schools or teachers make a difference? No definite answer exists because little research has been directed on the question in a comprehensive way' (Good et al. 1975, p. 3). Things have changed since then: the findings from numerous empirical studies have banished any serious doubts about whether schooling affects both the cognitive and motivational development of students. This article will report the arguments questioning the importance of school for the mental development of children and adolescents and assess their scientific validity; present the empirical evidence on the strong impact of school on cognitive development, while simultaneously showing the limits to its generalizability;
and sketch the findings on the role of school in motivational development. The final section presents some conclusions on the relations between cognitive and motivational development under the influence of school.
1. Scientific Doubts Regarding the Impact of Schooling on Cognitive Development
Compared with the impact of social origins and/or stable individual differences in intellectual abilities on cognitive development, that of schooling was long considered marginal. Nonetheless, the Coleman Report (Coleman et al. 1966) was still a great shock to many politicians, educators, and teachers when it concluded 'that school brings little influence to bear on a child's achievement that is independent of his background and general social context' (p. 325). Just as radically, Jencks et al. (1972) confirmed 'that the character of schools' output depends largely on a single input, namely, the characteristics of the entering children. Everything else—the school's budget, its policies, the characteristics of the teachers—is either secondary or completely irrelevant' (p. 256). According to their findings, elementary school contributed 3 percent or less to the variance in cognitive development, and secondary school 1 percent or less. Even though both Coleman et al.'s (1966) and Jencks et al.'s (1972) general statements were modified and differentiated slightly in comparisons between private and public schools, between middle-class and lower-class children, and between the effects of different school variables, the final conclusion remains unchanged: 'Nevertheless, the overall effect of elementary school quality on test scores appears rather modest' (Jencks et al. 1972, p. 91). Viewed from the perspective of the beginning of the twenty-first century, such statements and conclusions are, without exception, underestimations of the impact of schooling on cognitive development in childhood and adolescence. There are various reasons for these underestimations. First, two aspects of cognitive development that should be kept strictly separate are frequently confounded: the growth and acquisition of competencies, knowledge, and skills on the one side, and the change in interindividual differences in cognitive abilities on the other. We now know that schools are necessary and very influential institutions for the acquisition of declarative and procedural knowledge that cannot be learned in the child's environment outside school, but that the quantity and quality of schooling have only a limited power to modify individual differences in abilities. Second, Coleman et al. (1966), Jencks et al. (1972), and many other studies compared very similar schools in the USA or other industrialized countries. Naturally, such schools reveal more commonalities than
differences due to the similarities in teacher training, curricular goals, educational traditions, the budget of the educational institutions, the state-regulated quantity of teaching, and standardized achievement tests. The methodological problems in such research did not become evident until the publication of findings from cross-cultural studies. These showed that when schools in the Third World—or even achievement measured in ethnic groups with no or very low school education—are taken into account alongside schools in industrialized countries, massive effects of the quantity and quality of instruction can be ascertained (Anderson et al. 1977). The third reason for underestimating the impact of schools was that scientific interest in educational sociology focused on the role of social class and the family in mental development, whereas developmental psychology was dominated by Jean Piaget's constructivist model. Whereas theories of cognitive development deal with general mechanisms of thought, action, and experience, models of school learning in the strict sense are concerned with the acquisition of specific skills and small bits of knowledge. Piaget and his co-workers emphasized 'that this form of learning is subordinate to the laws of development and development does not consist in the successive accumulation of bits of learning since development follows structuration laws that are both logical and biological' (Inhelder and Sinclair 1969, p. 21).
2. Empirical Confirmation of the Impact of Schooling on Cognitive Development
Among many social scientists, the skeptical attitude toward the role of schools has in recent decades been overcome to a large extent (although not completely) by the results of numerous empirical studies. This applies not only to the acquisition of specific cognitive competencies but also to the promotion of intelligence. After reviewing the relevant literature, Rutter (1983, p. 20) came to the conclusion that 'the crucial component of effective teaching includes a clear focus on academic goals, an appropriate degree of structure, an emphasis on active instruction, a task-focused approach, and high achievement expectations.' Although recent years have seen increasingly strong criticism of Rutter's description of the successful classroom, it is broadly confirmed empirically that active teachers, direct instruction, and effective classroom management on the one side, accompanied by active, constructive, and purposeful learners on the other side, are necessary conditions for the effective acquisition of competencies, knowledge, and skills. Naturally, a major role can also be assigned to intrinsically motivated, self-organized, and cooperative learning (Weinert and Helmke 1995). Nonetheless, the impact of schools on cognitive development ranges far beyond teaching declarative
and procedural knowledge and also includes the promotion of intellectual abilities. For example, Ceci (1991) summarized his state-of-the-art review as follows: Of course, schooling is not the complete story in the formation and maintenance of IQ-scores and IQ-related cognitive processes … Children differ in IQ and cognitive processes prior to entering school, and within a given classroom there are sizeable individual differences despite an equivalence of schooling. … Thus, the conclusion seems fairly clear: Even though many factors are responsible for individual and group differences in the intellectual development of children, schooling emerges as an extremely important source of variance, notwithstanding historical and contemporary claims to the contrary. (p. 719)
Looking at the available data from industrialized countries, the Third World, and, above all, cross-cultural studies on the impact of schooling on cognitive development in general and the effectiveness of specific features of schools in particular, substantial effects can be ascertained on intellectual abilities; on metacognitive strategies; on the acquisition of verbal, mathematical, scientific, logical, and technological competencies; and on various forms of domain-specific and cross-domain knowledge. These findings provide unequivocal support for Geary's (1995) hypothesis that schools are necessary cultural conditions for the acquisition of those abilities and skills that cannot be learned in the child's immediate environment: this means almost all the cognitive competencies necessary for a good life and a successful career in our modern scientifically and technologically shaped world. The level to which cognitive development is enhanced, and which competencies are acquired, depend strongly on the quality and quantity of schooling. This is confirmed by findings from a number of large-scale international studies that not only compared school achievement but also assessed important features and variables in the individual national school systems. A current example is the Third International Mathematics and Science Study (TIMSS) carried out by the International Association for the Evaluation of Educational Achievement (IEA; see Baumert et al. 1997). In association with numerous national and regional studies on educational productivity, school effectiveness, or school improvement, it has been possible to identify a variety of combinations and configurations of variables in the school system, the individual school, the classroom, the teacher, instruction, and the school context that impact on various dimensions and aspects of the cognitive development of students. At the same time, these findings have also led to scientific suggestions for improving school systems. Alongside mean effect sizes, the long-term effects of exceptional schools and teachers are of particular interest. For example, Pederson et al. (1978) reported
the strong long-term effects of one excellent teacher in a single-case study. Compared with the students of other teachers, the children she had taught in their first two years at school were significantly more successful in their later academic careers as well as in their later jobs. More detailed analyses of this effect revealed that this teacher attained above-average effects on school performance in the first two school grades as well as on her students' attitudes toward work and their initiative, but no great increase in intellectual abilities. Even more important than the direct teacher effects were the indirect effects of a good, successful, and positively experienced start to school on all the children's further cognitive and motivational development. Although this is only a single-case study, its findings could be replicated as a trend, though not so emphatically, in several representative longitudinal studies (Weinert and Helmke 1997). However, great methodological care is necessary when gathering, analyzing, and interpreting longitudinal data. Only multilevel analyses (that view teacher or instruction variables as being moderated by classroom contexts, and classroom effects as being influenced by features of the school, the school system, and the sociocultural environment) permit valid estimations of the effects of certain clusters of variables on students' cognitive development and growth in knowledge; a minimal sketch of such a model is given below. Such studies have shown that individual variables make only a very limited contribution to explaining the variance in student outcomes. There are many reasons for this: one of the most important is the complex 'multideterminedness' of students' academic achievement. Haertel et al. (1983) characterize the ensuing theoretical and methodological problems as follows: … classroom learning is a multiplicative diminishing-returns function of four essential factors—student ability and motivation, and quality and quantity of instruction … Each of these essential factors appear to be necessary but insufficient by itself to classroom learning; that is, all four of these factors appear to be required at least at minimum levels for the classroom learning to take place. It also appears that the essential factors may substitute, compensate, or trade-off for one another in diminishing rates of return. (p. 75)
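To make the multilevel logic concrete, the following is a minimal sketch of a standard two-level random-intercept model (students nested within classrooms); the notation is illustrative and is not taken from the studies cited here:

$$y_{ij} = \beta_{0j} + \beta_{1} x_{ij} + e_{ij}, \qquad \beta_{0j} = \gamma_{00} + \gamma_{01} w_{j} + u_{0j}$$

where $y_{ij}$ is the achievement of student $i$ in classroom $j$, $x_{ij}$ is a student-level predictor (e.g., prior ability), $w_{j}$ is a classroom- or teacher-level variable (e.g., instructional quality), and $e_{ij}$ and $u_{0j}$ are the student- and classroom-level residuals. Estimating the classroom-level effect $\gamma_{01}$ at its own level, rather than regressing individual scores directly on $w_{j}$, keeps classroom-level variance from being misattributed to individual students.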
3. The Limited Impact of School on the Modification of Individual Differences in Cognitive Abilities and Competencies
Of the four essential conditions for classroom learning identified by Haertel et al. (1983), only two (the quantity and quality of instruction) are under the control of the school and the teacher. The other two (ability and motivation) are features of students.
By the time they enter school, children already exhibit large interindividual differences in both their cognitive competencies and their motivational tendencies. Through genetic factors and through covariations between genotype and environmental influences, interindividual differences in intellectual abilities already become moderately stable early in childhood. Around the seventh year of life, astonishingly high stability coefficients of 0.5 to 0.7 are already found, despite the relatively low reliability of many tests at this age level. Assuming that more intelligent students learn, on average, more quickly, more easily, and better than less intelligent students under the same conditions of instruction, it can be anticipated that interindividual differences in mental abilities and cognitive development will not diminish under the influence of school but become increasingly more stable. This is also supported by the results of many empirical studies: for example, during the course of elementary schooling, stability coefficients increase to about 0.8 for verbal intelligence tests, 0.7 for nonverbal intelligence, 0.7 for mathematical competencies, 0.7 for reading and writing, and 0.6 for the development of positive attitudes toward learning, self-confidence, and test anxiety (Weinert and Helmke 1997). This recognizable trend in elementary school continues in secondary school, with stabilities in IQ scores already attaining values between 0.8 and 0.9 from the age of 12 years onward, thus permitting almost perfect long-term prediction. These and other empirical findings can be used to draw some theoretical conclusions on classroom learning that may be modified, but not falsified, by variations in school contexts: (a) Cognitive abilities and competencies develop dramatically in childhood and adolescence—not least under the influence of the school. At the same time, interindividual differences already stabilize at a relatively early age and then remain more or less invariant. (b) This trend applies not only to intellectual abilities but also to the acquisition of cognitive competencies and domain-specific knowledge. This is particularly apparent when the knowledge acquired is cognitively demanding. In this case, it is necessary to assume a relationship between the level of intelligence and the quality of the knowledge ('intelligent knowledge'). (c) Naturally, the stabilities in individual differences are in no way perfect, so that changes can be observed in students' relative positions in the ranking of abilities and academic achievements. However, very strong changes are the exception, often due to idiosyncrasies, rather than the rule. (d) All attempts to level out those individual differences in cognitive abilities and competencies that stabilize during the course of schooling, and to make achievements and achievement dispositions equal for all students, have generally been unsuccessful. The same applies to the concept of mastery learning when
practiced over a longer period of time and on demanding tasks. (e) Of course, very significant changes in individual differences result when students are taught for different lengths of time and/or with different achievement aspirations in various subjects or the same subject. Whereas, for example, some students broaden their basic knowledge of the world only slightly through the physics they learn at school, others acquire a high degree of expertise in the subject. Despite the doubtful validity of the claim put forward in the novice–expert research paradigm that intellectual abilities and talents are irrelevant for the acquisition of expertise, the extent of deliberate practice seems to be decisive for the achievement of excellence. (f) Regardless of the level of individual abilities and regardless of the stability of interindividual differences in ability, it is still necessary for all students to acquire all their cognitive competencies through active learning. Many studies have shown that the quantity and quality of instruction are useful and, in part, even necessary for this.
4. The Impact of Schooling on Motivational Development
The available statistical meta-analyses confirm the belief shared by most laypersons that student motivation is an essential personal factor for successful learning at school. However, motivational tendencies and preferences are not just conditions of learning but are also influenced by learning and teaching in the classroom. Their systematic promotion is simultaneously an important goal of school education. To analyze the motivational conditions and consequences of school learning, it is necessary to distinguish strictly between dispositional motives and current motivation. Whereas motives (e.g., achievement motive, affiliation motive, attribution style, test anxiety, self-confidence) are understood as personal traits that are relatively stable and change only slowly, motivation concerns current approach or avoidance tendencies that not only depend on dispositional motives but are also influenced strongly by situational conditions (stimuli, incentives, rewards, threats, etc.). Naturally, the long-term modification of motivational development through the school predominantly involves the students' dispositional motives, attitudes, and value orientations. School enrollment between the fifth and seventh year of life seems to be a sensitive period of development for this. The cognitive competencies acquired during the preschool years enable first graders to compare themselves with others, to make causal interpretations of the differences and changes in achievement they register, and to form anticipations regarding future achievements. At the same time, the school class offers a variety of opportunities for
comparisons on the levels of behavior, achievement, evaluation, and ability. Teachers play an important role in this. Almost all preschool-age children are optimists: they look forward to attending school, most anticipate that they will do well, they have only a low fear of failure, and they believe that they will be able to cope with all the demands of the school. This more-than-optimistic attitude changes during the first two years at school. In general, children become realists, although most of them continue to hold positive expectations regarding their own success. This basic attitude stabilizes during the further course of schooling—naturally, with large interindividual differences due in part to personal academic successes and failures, and in part to a stable anticipation bias (Weinert and Helmke 1997). This general developmental trend is modified more or less strongly by specific features of the school, the classroom, the teacher, and the methods of instruction (for an overview, see Pintrich and Schunk 1996). A few examples of this will be given below. These come either from studies in which the motivational effects of teaching and learning were observed in the classroom, or from studies assessing the effects of planned interventions. Personal causation (the feeling of being an origin and not a pawn) is an important motivational condition for successful classroom learning. If students are to experience personal causation, they must have opportunities to (a) set themselves demanding but realistic goals, (b) recognize their own strengths and weaknesses, (c) be self-confident about the efficacy of their own actions, (d) evaluate whether they have attained the goals they have set themselves, and (e) assume responsibility not only for their own behavior but also, for example, for that of their classroom peers. Psychological interventions designed to enhance the 'origin atmosphere' in the classroom and increase the 'origin experiences' of the students have led not only to the intended motivational changes but also to broad improvements in average academic achievement. Similarly positive outcomes are reported from studies designed to improve the goal orientation of learning, to strengthen achievement motivation, and to modify the individual patterns of attributions used in intuitive explanations of success and failure. For example, efforts have been made with both classrooms and individual students to modify the typical attribution style for learned helplessness (attributing failure to stable deficits in ability and success to luck or low task difficulty) so that given abilities and varying effort as a function of task difficulty become the dominant personal attribution patterns. Positive modifications to student motivational development depend decisively on the teacher's attitudes, instructional strategies, and feedback behavior. As a result, many intervention programs for enhancing motivation focus on both students and teachers
simultaneously. For example, a large study of 'self-efficacious schools' is currently being carried out in Germany. It is testing the assumption that enhanced self-efficacy beliefs may affect not only teacher behavior (by reducing burnout syndromes and enhancing a proactive attitude) but also student learning (by increasing motivation and achievement). For several decades, theories of learning motivation have been dominated by the assumption that basic needs are satisfied through achievement, social support, rewards, self-evaluation, and so on. Frequently the motivational mechanisms were those underlying expectancy-value models, cost-benefit calculations, or instrumental behavior models. In comparison, intrinsic motives for learning have played a much smaller role. Intrinsic motivation addresses needs and goals that are satisfied by the process of learning itself and by the possibility of experiencing and enjoying one's own expression of competence and increasing knowledge in a preferred content domain. In this theoretical framework, working and learning are not a means to an end, but are more or less an end in themselves; learning activities and learning goals are not clearly distinguished (e.g., exploratory drive, experience of flow, effectance motivation, etc.). Underlying these concepts is the assumption '… that intrinsic motivation is based in the innate, organismic needs for competence and self-determination. It energizes a wide variety of behaviors and psychological processes for which the primary rewards are the experience of effectance and autonomy' (Deci and Ryan 1985, p. 32). The reason for a careful consideration of intrinsic motivation is that in educational philosophy intrinsically motivated learning and behavior are regarded as the most important goals of schooling. The worry is that strengthening extrinsic motivation by making external incentives and rewards available is likely to be detrimental to achieving this goal. The ideas underlying such fears come from the results of some studies investigating the 'overjustification effect,' which show that the use of rewards can undermine intrinsic motivation. However, the results of several recent studies indicate that this fear is unjustified. Furthermore, the relation between dispositional motives, actual motivation, and learning activities is not simple and linear but is influenced strongly by interconnections with cognitive processes, metacognitive strategies, and volitional skills.
5. Conclusions
Schools are cultural and educational institutions for promoting cognitive development and imparting those competencies that the child cannot acquire within the immediate environment. Throughout the world, schools fulfil this evolutionary, psychological, and cultural function more or less well. The quantity and quality of instruction are a major source of the
variance in the genesis of interindividual differences in cognitive competencies. Moreover, the development and promotion of competent, self-regulated learning is an equally important goal of schools in modern societies. The development of intrinsic and extrinsic motives becomes particularly important here in association with metacognitive and volitional competencies. This is why recent decades have seen not only studies examining the relations between the classroom atmosphere, the teacher's behavior, and the subjective experiences of the students on the one side and their motivational and cognitive development on the other, but also numerous programs designed to promote motivation systematically, which have been tried out with, in part, notable success. The relation between cognitive and motivational development and their interaction are of particular interest in both theory and practice. This is just as true for the influence of academic success on the development of a positive self-concept as for the importance of the self-concept for individual progress in learning (Weinert and Helmke 1997). See also: School Effectiveness Research; School Outcomes: Cognitive Function, Achievements, Social Skills, and Values
Bibliography
Anderson R C, Spiro R J, Montague W E (eds.) 1977 Schooling and the Acquisition of Knowledge. Erlbaum, Hillsdale, NJ
Baumert J, Lehmann R et al. 1997 TIMSS – Mathematisch-naturwissenschaftlicher Unterricht im internationalen Vergleich: Deskriptive Befunde. Leske and Budrich, Opladen, Germany
Ceci S J 1991 How much does schooling influence general intelligence and its cognitive components? A reassessment of evidence. Developmental Psychology 27: 703–22
Coleman J S, Campbell E Q, Hobson C J, McPartland J, Mood A, Weinfeld F D, York R L 1966 Equality of Educational Opportunity. Government Printing Office, Washington, DC
Deci E, Ryan R M 1985 Intrinsic Motivation and Self-determination in Human Behavior. Plenum Press, New York
Geary D C 1995 Reflections on evolution and culture in children's cognition. American Psychologist 50: 24–36
Good T L, Biddle B J, Brophy J E 1975 Teachers Make a Difference. Holt, Rinehart, & Winston, New York
Haertel G D, Walberg H J, Weinstein T 1983 Psychological models of educational performance: A theoretical synthesis of constructs. Review of Educational Research 53: 75–91
Inhelder B, Sinclair H 1969 Learning cognitive structures. In: Mussen P H, Langer J, Covington M (eds.) Trends and Issues in Developmental Psychology. Holt, Rinehart, & Winston, New York, pp. 2–21
Jencks C, Smith M, Acland H, Bane M J, Cohen D, Gintis H, Heyns B, Michelson S 1972 Inequality. Basic Books, New York
Pederson E, Faucher T A, Eaton W W 1978 A new perspective on the effects of first grade teachers on children's subsequent adult status. Harvard Educational Review 48: 1–31
Pintrich P R, Schunk D 1996 Motivation in Education: Theory, Research, and Applications. Prentice Hall, Englewood Cliffs, NJ
Rutter M 1983 School effects on pupil progress: Research findings and policy implications. Child Development 54: 1–29
Weinert F E, Helmke A 1995 Learning from wise mother nature or big brother instructor: The wrong choice as seen from an educational perspective. Educational Psychologist 30: 135–42
Weinert F E, Helmke A (eds.) 1997 Entwicklung im Grundschulalter. Beltz, Weinheim, Germany
F. E. Weinert
Schools, Micropolitics of
'Micropolitics' is a specific perspective in organization theory. It focuses on 'those activities taken within organizations to acquire, develop, and use power and other resources to obtain one's preferred outcomes in a situation in which there is uncertainty or dissent' (Pfeffer 1981, p. 7), because it considers these activities to be most important for the constitution and the workings of organizations. This article will explore the uses and usefulness of this concept in educational research. It starts by outlining the micropolitical view of educational organizations. Then, it gives some examples of representative micropolitical research in order to illustrate typical topics and research strategies. Finally, it discusses criticisms of micropolitics and indicates areas for development.
1. The Concept of Micropolitics
1.1 Main Elements
Traditional organizational theories of different origins seem to converge in describing organizations as goal oriented and rationally planned, characterized by stable objective structures, a high degree of integration, and a common purpose of their members. Conflicts between the members of an organization are considered costly and irrational 'pathologies' to be eradicated as soon as possible. However, conflicts are endemic in organizational life (Ball 1987). Different stakeholders pursue their own interests, which are not necessarily identical with the formulated goals. Coalitions are formed, meetings are boycotted, and formal structures are ignored. As such phenomena frequently occur in organizations, it is not helpful to exclude them from organizational theorizing: models which can cope with this 'dark side of organizational life' (Hoyle 1982, p. 87) are needed. The micropolitical perspective was originally developed in the realm of profit organizations (e.g.,
Bacharach and Lawler 1980, Pfeffer 1981). In a seminal 1982 paper, Eric Hoyle put forward arguments for exploring the concept also for education. In 1987 Stephen Ball published the book The Micro-Politics of the School, in which he used empirical material to push the development of the concept. In its wake the term 'micropolitics' appeared more frequently in various research papers (see Blase 1991 for a representative collection). What are the main elements of a micropolitical view of schools as presented in this early research? (a) The micropolitical perspective is based on a specific view of organizations. They are seen as characterized by diverse goals, by interaction and relationships rather than structures, by diffuse borders and unclear areas of influence rather than clear-cut conditions of superordination and delegation, and by continuous, unsystematic, reactive change rather than by longer phases of stable performance and limited projects of development. The main focus of micropolitical approaches is not the organization as it aspires to be in mission statements or organizational charts, but the organization-in-action and, in particular, the 'space between structures' (Hoyle 1982, p. 88) which produces enough ambiguity to allow political activities to flourish. (b) A micropolitical approach is based on a specific image of actors. The members of organizations are seen as pursuing their own interests in their daily work. The actors may do this as individuals, they may coalesce in loosely associated interest sets, or they may use subdivisions of the organization (e.g., departments, professionals versus administration) as power bases. In order to protect or enhance their organizational room for maneuver, they aim to retain or obtain control of resources, such as the following (see Kelchtermans and Vandenberghe 1996, p. 7, Ball 1987, p. 16): (i) material resources, such as time, funds, teaching materials, infrastructure, timetabling, etc. (ii) organizational resources, such as procedures, roles, and positions which enable and legitimate specific actions and decisions. They define 'prohibited territories' and relationships of subordination or autonomy. Thereby, they have a bearing on the actors' chances to obtain other resources (e.g., through career, participation, etc.). (iii) normative or ideological resources, such as values, philosophical commitments, and educational preferences. A particularly important ideological resource is the 'definition of the organization,' which indicates what is legitimate and important in the organizational arena. (iv) informational resources: the members of an organization are interested in obtaining organizationally relevant information and in having their expertise, knowledge, and experience acknowledged. (v) social resources: affiliation to, support from, and reputation with influential groups within and outside
the school are assets which may be mobilized for one's organizational standing. (vi) personal resources: personal characteristics—such as being respected as a person or being grounded in an unchallenged identity as a teacher—are also resources in organizational interactions. (c) A micropolitical perspective pays special attention to interaction processes in organizations. These are interpreted as a strategic and conflictual struggle over the shape of the organization which, in consequence, defines the members' room for maneuver. This has the following implications: (i) Organizational interaction is power-impregnated. It draws on the resources mentioned above for influencing the workings of an organization. However, these resources have to be mobilized in interaction in order to become 'power.' (ii) Power requires a relationship. In order to satisfy his or her interests, an actor needs some interaction from other actors. Power builds on dependency, which makes the relationship reciprocal, but, as a rule, asymmetric. (iii) Power is employed from both ends of the organizational hierarchy. There are advantages of position; however, 'micropolitical skills may be well distributed throughout the organization and position is not always an accurate indicator of influence' (Ball 1994, p. 3824). (iv) Actors use a multitude of strategies and tactics to interactively mobilize their resources and to prevent others from doing so, such as, for example, setting the agenda of meetings, boycotting and escalating, avoiding visibility, confronting openly or complying, carefully orchestrating scenes behind closed doors or in public arenas, etc. (v) Organizational interaction is seen as conflictual and competitive. 'Confronted by competition for scarce resources and with ideologies, interests and personalities at variance, bargaining becomes crucial' (Gronn 1986, p. 45). The relationship between control and conflict (or domination and resistance) is conceived 'as the fundamental and contradictory base of organizational life' (Ball 1994, p. 3822). (vi) Organizational change is nonteleological, pervasive, and value-laden, since it advances the position of certain groups at the expense of the relative status of others (see Ball 1987, p. 32). Schools are not organizations easily changed; a 'quick-fix orientation' in innovation will not be appropriate to their cultural complexities (see Corbett et al. 1987, p. 57).
2. Research in the Micropolitics of Education
2.1 Methods and Strategies
If part of the micropolitical activities which are interesting for the researcher takes place in privacy, if
these processes are often overlain by routine activities (see Ball 1994, p. 3822), and if experienced actors regularly use strategies such as covering up actions, deceiving about real intentions, etc., then researching the micropolitics of schools will have to cope with some methodological problems. To come to grips with their object of study, micropolitical studies focus on critical incidents, persons, and phases that disrupt routine, for example, the appointment of a new principal, the introduction of new tasks or programs, instances of structural reorganization (e.g., a school merger), etc. They pay special attention to the processes of (re-)distribution of resources, rewards, and benefits (see Ball 1994, p. 3823) or analyze instances of organizational conflict to find out how a new organizational order has been established. These strategies provide quicker access to indicative processes, but are certainly in danger of inducing an overestimation of the proportion of conflict in organizations. Micropolitical studies try to enhance their credibility through triangulation of methods, collection of multiple perspectives, long-term engagement in the field, critical discourse in a team of researchers, and feedback from participants and external researchers. Although there have been some attempts to tackle organizational micropolitics with quantitative means (see Blickle 1995), most studies are based on qualitative methods. There have been interview and observation studies; however, the typical strategy is a case study approach mainly based on interviews, field notes from participant or nonparticipant observation, transcripts of meetings, analyses of documents, and cross-case comparison of different sites. A good example of such a case study is Sparkes' (1990) three-year study of a newly appointed departmental head's attempts at innovation: the head seeks to orientate Physical Education teaching towards a child-centered approach emphasizing mixed-ability grouping and individual self-paced activities which run counter to the then prevalent 'sporting ideology.' In the beginning the new head believes in rational change and democratic participation. The head tries to make other department members understand his own particular teaching approach before introducing changes to the curriculum. However, not all staff can be convinced through dialogue. In his management position the departmental head can control the agenda of department meetings, the time and procedure of arriving at decisions, etc. He secures the agreement of crucial staff members before the meetings and networks with other influential persons outside his department. His proposals for innovation are nevertheless met with resistance by some staff, but this resistance cannot prevent the structural changes from being introduced. These structural changes, however, cannot guarantee teachers' conformity to the head's educational goals within the confines of individual classrooms. Thus, he goes on to consolidate
his domination of the department by careful selection of new staff and exclusion of the unwilling. Although successful in that his proposals for curriculum change were implemented within one year, the departmental head ended up in a situation that was characterized by conflict and uncertainty and, thus, in many respects, counteracted his own aspirations of democratic leadership. Sparkes suggests that his findings have some bearing beyond the specific case. He warns that the 'ideology of school-centered innovation … fails to acknowledge the presence and importance of conflict and struggle both within, and between, departments … the commonly held view of teacher participation in the innovative process as 'good' in itself is highly questionable. The findings presented indicate that qualitative differences exist in the forms of participation available to those involved and that, to a large extent, these are intimately linked to the differential access that individuals have to a range of power resources' (Sparkes 1990, p. 177).
2.2 Research Topics
Typical topics of micropolitical studies are the following: Processes of change: when organizational 'routine games' are disrupted, e.g., through educational, curricular, or organizational innovations, through the recruitment of new principals or teachers, or through the introduction of quality assurance systems, then intensified processes of micropolitical relevance are to be expected. Some studies investigate how externally mandated reform is dealt with in schools: when Ball and Bowe (1991, p. 23) studied the impact of the 1988 British Education Reform Act, they found that 'the implementation of externally initiated changes is mediated by the established culture and history of the institution,' and that such changes' 'acceptance and their implementation become sites as well as the stake of internal dispute.' The reform puts some possible alternative definitions of the meaning and the workings of schools on the agenda, which have to be 'processed' in the local environments through disputes over topics such as, for example, 'serving a community versus marketing a product' or 'priority to managerial or professional views of schooling.' Leadership and the relationships in the organization: traditionally, educational ethnography had a strong classroom orientation. Micropolitics may be understood as a move to apply ethnographic methods to study the interaction of the members of the school outside the classroom. Although parents', pupils', and ancillary staff's voices are included in some studies, most research focuses on the interaction of teachers. Since steering and regulation of schoolwork are at the heart of the micropolitical approach, it is no wonder that issues of leadership have been frequently investigated (see Ball 1987, p. 80). Holders of senior positions
must acquire micropolitical skills for their career advancement and they 'have much to lose if organizational control is wrested from them' (Ball 1994, p. 3824), which makes them perfect foci for study. While many of these studies concentrated on sketching vivid images of a 'politics of subordination,' Blase and Anderson (1995) developed a framework for a 'micropolitics of empowerment' through facilitative and democratic leadership. Socialization and professional development of teachers: the politically impregnated organizational climate in schools is the context in which individual teachers develop their professional identity (Kelchtermans and Vandenberghe 1996). During the first years of their career, teachers concentrate on acquiring knowledge and competencies with respect to academic and control aspects of classroom instruction. Later on they build up a 'diplomatic political perspective,' because they feel under 'constant scrutiny' by pupils, parents, fellow teachers, and administrators (see Blase 1991, p. 189, Kelchtermans and Vandenberghe 1996). A pervasive feeling of 'vulnerability' induces many teachers to develop a range of protective strategies, such as low-risk grading, documentation of assessment and instruction, avoidance of risky topics and extracurricular activities, etc. (see Blase 1991, p. 193). Thus, it is no wonder that some authors claim that the acquisition of political competence is fundamental to teachers' career satisfaction (see Ball 1994, p. 3824). Since micropolitical perceptions and competencies are almost absent in beginning teachers' interpretive frameworks, attention to the induction phase is important (Kelchtermans and Vandenberghe 1996, p. 12). Understanding schools as organizations: since micropolitics is an approach to organization theory, all these studies aim to contribute to our understanding of schools as organizations. There are many studies which prove the intensity of power-impregnated activity in schools and analyze typical causes and process forms. It has been argued that political control strategies are a characteristic part of school life 'precisely because the professional norms of teacher autonomy limit the use of the effectiveness of more overt forms of control' (Ball 1994, p. 3824). In Iannaconne's (1991) view, the unique contribution a micropolitical approach can offer lies in exploiting the metaphor of 'society' for schools: a typical school is organized like a 'caste society' since a majority of its members are 'subject to its laws without the right to share in making them' (Iannaconne 1991, p. 469). From the characteristics of a 'caste society' some hypotheses about schools may be derived, e.g.: (a) Such societies are characterized by great tensions between the castes and a fragile balance of power. (b) They tend to conceal the internal differences of a caste from others. Consequently, critical self-examination of castes and open debate over policy issues is difficult.
(c) They tend to avoid fundamental conflicts and to displace them into petty value conflicts. 'In schools, as in small rural towns, the etiquette of gossip characterizes teacher talk. Faculty meetings wrangle over trivial matters, avoiding philosophic and ethical issues like the plague' (Iannaconne 1991, p. 469).
3. Criticism and Areas for Development
The first wave of research and conceptual work in micropolitics at the beginning of the 1990s triggered some criticism, which will be discussed in order to identify possible limitations, but also areas for development. (a) Many authors of the micropolitical approach revel in examples of illegitimate strategies: secrecy, deception, and crime add much to the thrill of micropolitical studies. Some critics argue that talking about the pervasiveness of 'illegitimate power' itself produces a 'micropoliticization' of practice and will undermine confidence in organizations. This claim implies that a second meaning of 'micropolitics' is introduced: it does not (only) refer to an approach in organization theory (which analyses both legitimate and illegitimate interactions and their effects on the structuration of an organization), but refers to those specific activities and attitudes in organizations which are based on power politics with illegitimate means. This conceptual move, however, produces problems: how should it be possible to draw the line between 'good' and 'bad' organizational politics, since it was precisely situations characterized by diverse goals and contested expertise which made a concept like 'micropolitics' necessary? (b) Although the proponents of micropolitics are right to criticize a presupposition of organizational consensus, it is unsatisfactory to interpret any consensus as a form of domination (see Ball 1987, p. 278, Ball 1994, p. 3822). Many early studies did not include cooperative or consensual interactions (see Blase 1991, p. 9). It will be necessary to develop conceptual devices to account for relationships of co-agency which neither dissolve any cooperation into subtle forms of domination nor gloss over the traces of power which are at the core of collaboration and support (see Altrichter and Salzgeber 2000, p. 104). (c) In focusing on strategic aspects of action, micropolitical approaches are again in danger of overemphasizing the 'rationality' of organizational action. To understand the workings of organizations it may be necessary to consider actions which are mainly influenced by emotions and affect, and which have not been carefully planned according to some interests, as well as unintended consequences of otherwise motivated actions. (d) Although right to put the spotlight on the
potential instability of organizational processes, early micropolitical studies had some difficulty in conceptualizing the relative stability and durability of organizations, which we can observe every day in many organizations. For this purpose, it will be necessary to bring agency and structure into a balanced conceptual relationship: micropolitics will not deal with 'relationships rather than structures' (Ball 1994, p. 3822) but precisely with the interplay between interaction and structure. (e) The understanding of politics displayed in many early micropolitical studies was quite unsatisfactory, not to say unpolitical. Politics was often seen as the opposite of truth and reason; it was associated with treason, conspiracy, and accumulation of influence, as if fair and democratic ways of negotiation in organizations were inconceivable. More recent studies, for example, Blase and Anderson (1995), seem to overcome this bias. It was also criticized that in some micropolitical studies the external relationships of the organization were dissolved into the interactions between individual actors. However, it must be said that some studies took care to account for the societal embedding of organizational interactions (see Ball and Bowe 1991). In the meantime there is general agreement that this must be a feature of sound micropolitical studies (see Blase 1991, p. 237). Problems (c) and (e) may be overcome by reference to Giddens' (1992) theory of structuration, to Crozier and Friedberg's (1993) concept of organizational games, and to Ortmann et al.'s (1990) studies of power in organizational innovation. On this conceptual basis, organizational structure may be seen as an outcome of and resource for social actions, which enables and restricts actions but never fully determines them. Organizations are webs woven from the concrete interactions of (self-)interested actors. They acquire some spatial-temporal extension through the actors' use and reproduction of 'games.' The 'game' is the point of conceptual intersection at which the analytically divided elements of agency and structure, of individual and organization, meet. The game works as an 'indirect integration mechanism which relates the diverging and/or contradictory action of relatively autonomous actors' (Crozier and Friedberg 1993, p. 4). In order to take action and to pursue their own interests, members of organizations must at least partly use structural resources of the organization, i.e., they are forced to play within the organizational rules, and thereby they reproduce these rules. The functioning of an organization may be explained as the 'result of a series of games articulated among themselves, whose formal and informal rules indirectly integrate the organization members' contradictory power strategies' (Ortmann et al. 1990, p. 56). The interlocked character of the games and the high amount of routine elements in social practices are major reasons for the relative stability of organizations (see Altrichter and Salzgeber 2000, p. 106).
See also: Educational Innovation, Management of; Educational Leadership; Group Processes in Organizations; Leadership in Organizations, Psychology of; Organization: Overview; Organizational Decision Making; School as a Social System; School Improvement; School Management
Bibliography
Altrichter H, Salzgeber S 2000 Some elements of a micropolitical theory of school development. In: Altrichter H, Elliott J (eds.) Images of Educational Change. Open University Press, Buckingham, UK, pp. 99–110
Bacharach S, Lawler E 1980 Power and Politics in Organizations. Jossey-Bass, San Francisco, CA
Ball S J 1987 The Micro-Politics of the School. Routledge, London
Ball S J 1994 Micropolitics of schools. In: Husen T, Postlethwaite T N (eds.) The International Encyclopedia of Education, 2nd edn. Pergamon, Oxford, UK, pp. 3821–6
Ball S J, Bowe R 1991 Micropolitics of radical change. Budgets, management, and control in British schools. In: Blase J (ed.) The Politics of Life in Schools. Sage, Newbury Park, CA, pp. 19–45
Blase J (ed.) 1991 The Politics of Life in Schools. Sage, Newbury Park, CA
Blase J, Anderson G L 1995 The Micropolitics of Educational Leadership. Cassell, London
Blickle G 1995 Wie beeinflussen Personen erfolgreich Vorgesetzte, KollegInnen und Untergebene? [How do persons successfully influence superiors, colleagues, and subordinates?] Diagnostica 41: 245–60
Corbett H D, Firestone W A, Rossman G B 1987 Resistance to planned change and the sacred in school cultures. Education Administration Quarterly 23: 36–59
Crozier M, Friedberg E 1993 Die Zwaenge kollektiven Handelns. Hain, Frankfurt/M., Germany [1980 Actors and Systems. University of Chicago Press, Chicago, IL]
Giddens A 1992 Die Konstitution der Gesellschaft. Campus, Frankfurt/M., Germany [1984 The Constitution of Society. Polity Press, Cambridge, UK]
Gronn P 1986 Politics, power and the management of schools. In: Hoyle E, McMahon A (eds.) World Yearbook of Education. Kogan Page, London
Hoyle E 1982 Micropolitics of educational organizations. Educational Management and Administration 10: 87–98
Iannaconne L 1991 Micropolitics of education – what and why. Education and Urban Society 23: 465–71
Kelchtermans G, Vandenberghe R 1996 Becoming political: a dimension in teachers' professional development. Paper presented at the AERA conference, New York (ERIC document: ED 395-921)
Ortmann G, Windeler A, Becker A, Schulz H-J 1990 Computer und Macht in Organisationen [Computers and power in organizations]. Westdeutscher Verlag, Opladen, Germany
Pfeffer J 1981 Power in Organizations. Pitman, Boston, MA
Sparkes A C 1990 Power, domination and resistance in the process of teacher-initiated innovation. Research Papers in Education 5: 153–78
H. Altrichter
Schumpeter, Joseph A (1883–1950)
Joseph A. Schumpeter is one of the best known and most flamboyant economists of the twentieth century. His main accomplishments are to have introduced the entrepreneur into economic theory through Theorie der wirtschaftlichen Entwicklung (1912) and to have produced a magnificent history of economic thought, History of Economic Analysis (posthumously published in 1954). Schumpeter's interests were wide ranging, and his writings also include contributions to sociology and political science. His analysis of democracy, which can be found in his most popular work, Capitalism, Socialism, and Democracy (1942), is generally regarded as an important addition to the theory of democracy.
1. Early Life and Career
Joseph Alois Schumpeter was born on February 8, 1883 in the small town of Triesch in the Austro-Hungarian Empire (today Třešť in the Czech Republic). His father was a cloth manufacturer and belonged to a family which had been prominent in the town for many generations; his mother, who was the daughter of a medical doctor, came from a nearby village. Schumpeter's father died when he was four years old, an event that was to have great consequences for the future course of his life. Some time later his mother moved to Graz and then to Vienna, where she married a retired officer from a well-known, aristocratic family, Sigismund von Kéler. Schumpeter was consequently born into a small-town bourgeois family but raised in an aristocratic one; his personality as well as his work were to contain a curious mixture of values from both of these worlds. After graduating in 1901 from Theresianum, the preparatory school for the children of the elite in the Empire, Schumpeter enrolled at the University of Vienna. His goal from the very start was to become an economist. Around the turn of the century the University of Vienna offered one of the best educations in economics in the world, thanks to Carl Menger and his two disciples Eugen von Böhm-Bawerk and Friedrich von Wieser. Schumpeter had the latter two as his teachers, and he also attended lectures in mathematics on his own. When Schumpeter received his doctorate in 1906, he was 23 years old and the youngest person to have earned this degree in the Empire. After having traveled around for a few years and added to his life experience as well as to his knowledge of economics, Schumpeter returned in 1908 to Vienna to present his Habilitationsschrift. Its title was Das Wesen und der Hauptinhalt der theoretischen Nationalökonomie (1908; The Nature and Essence of Theoretical Economics), and it can be characterized as a work in economic methodology from an analytical perspective. In 1909 Schumpeter was appointed
assistant professor at the University of Czernowitz and thereby became the youngest professor in German-speaking academia. It was also at Czernowitz that Schumpeter produced his second work, the famous Theorie der wirtschaftlichen Entwicklung (1912; 2nd edn. from 1926 trans. as The Theory of Economic Development). At the University of Graz, to which he moved as a full professor in 1911, Schumpeter produced a third major study, Epochen der Dogmen- und Methodengeschichte (1914; trans. in 1954 as Economic Doctrine and Method). Over a period of less than ten years Schumpeter had produced three important works, and one understands why he referred to the third decade in a scholar's life as 'the decade of sacred fertility.'
1.1 The Works from the Early Years
Das Wesen is today best known as the place where the term 'methodological individualism' was coined. In its time, however, this work played an important role in introducing German students to analytical economics, which Gustav von Schmoller and other members of the Historical School had tried to ban from the universities in Germany. Das Wesen is well written and well argued, but Schumpeter would later regard it as a youthful error and did not allow a second edition to appear. The reason for Schumpeter's verdict is not known, but it is probably connected to his vigorous argument in the book that economics must sever all links with the other social sciences. While the emphasis in Das Wesen had been exclusively on statics, in his second book, Theorie der wirtschaftlichen Entwicklung, it was on dynamics. Schumpeter very much admired Walras' theory of general equilibrium, which he had tried to explicate in Das Wesen, but he was also disturbed by Walras' failure to handle economic change. What was needed in economics, Schumpeter argued, was a theory according to which economic change grew out of the economic system itself and not, as in Walras, a theoretical scheme in which economic change was simply explained as a reaction to a disturbance from outside the economic system. Schumpeter starts the argument in Theorie by presenting an economy where nothing new ever happens and which can therefore be analyzed with the help of static theory: all goods are promptly sold, no new goods are produced or wanted, and profits are zero. Schumpeter then contrasts this situation of a 'circular flow' with a situation in which the activities of the entrepreneur set the whole economic system in motion, and which can only be explained with the help of a dynamic theory. The model that Schumpeter now presented in Theorie would later be repeated and embellished upon in many of his writings: the entrepreneur makes an innovation, which earns him a huge profit; this profit attracts a swarm of other,
less innovative entrepreneurs; and as a result of all these activities a wave of economic change begins to work its way through the economic system. A business cycle, in brief, has been set in motion. The theory of business cycles that one can find in Theorie is fairly rudimentary, and Schumpeter would later spend much time improving upon it. Much of Schumpeter's analysis in Theorie is now forgotten, and few economists today are interested in the way that Schumpeter explains interest, capital, and profit by relating these to the activities of the entrepreneur. Some parts of Theorie, however, are still very much alive and often referred to in studies of entrepreneurship. In one of these Schumpeter discusses the nature of an innovation and explains how it differs from an invention; while an invention is something radically new, an innovation consists of an attempt to turn something—for example, an invention—into a money-making enterprise. In another famous passage Schumpeter distinguishes the five main forms of innovation:

(1) The introduction of a new good … or of a new quality of a good.
(2) The introduction of a new method of production …
(3) The opening of a new market …
(4) The conquest of a new source of supply of raw materials or half-manufactured goods …
(5) The carrying out of the new organization of any industry, like the creation of a monopoly position … or the breaking up of a monopoly position.
Schumpeter's description of what drives an entrepreneur is also often referred to, as is his general definition of entrepreneurship: 'the carrying out of new combinations.'

Schumpeter's third work from these early years is a brief history of economics, commissioned by Max Weber (1914–20) for a handbook in political economy (Grundriss der Sozialökonomik). Its main thesis is that economics came into being when analytical precision, of the type that one can find in philosophy, came into contact with an interest in practical affairs, of the type that is common among businessmen; to Schumpeter this meeting took place in the works of the Physiocrats. Epochen der Dogmen- und Methodengeschichte was later overshadowed by the much longer History of Economic Analysis, but while the latter tends to be used exclusively as a reference work, Schumpeter's study from 1914 can be read straight through and is also easier to appreciate.
1.2 Difficult Mid-Years

By the mid-1910s Schumpeter had established himself as one of the most brilliant economists of his generation, and like other well-known economists in the Austro-Hungarian Empire he hoped for a prominent political position, such as finance minister or economic adviser to the Emperor. During World War I Schumpeter wrote a series of memoranda, which were
circulated among the elite of the Empire, and through which he hoped to establish himself as a candidate for political office. The various measures that Schumpeter suggested in these memoranda are of little interest today (he argued, for example, that the Empire must not enter into a customs union with its powerful neighbor, Germany). What is of considerably more interest, however, is that they provide an insight into Schumpeter's political views when he was in his early 30s. Schumpeter's political profile at this time can be described in the following way: he was a royalist, deeply conservative, and an admirer of Tory democracy of the British kind.

Schumpeter's attempt during World War I to gain a political position failed, but his fortune changed after the Empire had fallen apart and the state of Austria came into being. In early 1919 Schumpeter was appointed finance minister in a joint Social Democratic and Catholic Conservative government in Austria, led by Karl Renner. By the fall of 1919, however, Schumpeter had been forced to resign, mainly because the Social Democrats felt that he had betrayed them and could not be trusted. The Social Democrats in particular thought that Schumpeter had gone behind their backs and stopped their attempts to nationalize parts of Austria's industry. Schumpeter always denied this, and in retrospect it is difficult to establish the truth. What is clear, however, is that the Austrian Social Democrats advocated a democratic and peaceful form of socialism, while Schumpeter detested all forms of socialism.

Loath to go back to his position at the University of Graz, Schumpeter spent the early part of the 1920s working for a small bank in Vienna, the Biedermann Bank. Schumpeter did not take part in the everyday activities of the bank but mainly worked as an independent investor. These activities ended badly, however; by the mid-1920s Schumpeter was forced to leave the Biedermann Bank, having by this time accumulated a considerable personal debt. For many years to come, Schumpeter would give speeches and write articles in order to pay back his debts. Adding to these economic misfortunes was the death in 1926 of his beloved wife Anna ('Annie') Reisinger, a working-class woman 20 years his junior to whom he had been married for only a year. From this time on, Schumpeter's friends would later say, there was a streak of pessimism and resignation in his personality.

In 1925 Schumpeter was appointed professor of public finance at the University of Bonn, and it is clear that he was both pleased and relieved to be back in academia. While his ambition, no doubt, extended to politics and business, he always felt an academic at heart. The great creativity that Schumpeter had shown in the 1910s was, however, not to return in the 1920s, and Annie's death was no doubt an important reason for this. Schumpeter's major project during the 1920s was a book on the theory of money, but this work was never completed to his satisfaction (Schumpeter 1970).
Schumpeter did, however, write several interesting articles during these years, some of which are still worth reading. One of these is 'The Instability of Capitalism' (1928), in which Schumpeter suggests that capitalism is undermining itself and will eventually turn into socialism. Another worthwhile article is 'Gustav v. Schmoller und die Probleme von heute' (1926), in which Schumpeter argues that economics has to be a broad science, encompassing not only economic theory but also economic history, economic sociology, and statistics. Schumpeter's term for this type of general economics is 'Sozialökonomik.' To illustrate what economists can accomplish with the help of sociology, Schumpeter wrote an article on social classes (1927), which is still often referred to. Schumpeter starts out by drawing a sharp line between the way that the concept of class is used in economics and in sociology. In economics, Schumpeter says, class is basically used as a category (e.g., wage-earner/non-wage-earner), while in sociology class is seen as a piece of living reality. Schumpeter also attempted to link his theory of the entrepreneur to the idea of class.

In mentioning Schumpeter's article on social classes, something must also be said about two of his other famous studies in sociology: 'The Crisis of the Tax State' (1918) and 'The Sociology of Imperialisms' (1918–19). The former study can be described as a pioneering effort in the sociology of taxation, which focuses on the situation in Austria just after World War I but which also attempts to formulate some general propositions about the relationship between taxation and society. Schumpeter, for example, suggests that if the citizens demand more and more subventions from the state, but are unwilling to pay for these through ever higher taxes, the state will collapse. In the article on imperialism, Schumpeter argues that imperialism grows out of a social situation that is not to be found in capitalism; whatever imperialist forms still exist represent an atavism and a leftover from feudal society. Schumpeter's famous definition of imperialism is as follows: 'imperialism is the objectless disposition on the part of the state to unlimited forcible expansion.'
2. The Years in the United States

By 1930 Schumpeter was still depressed about his life and quite unhappy about his career; he had found no one to replace his wife on an emotional level, and he felt that he deserved a better professional position than Bonn. When Werner Sombart's chair at the University of Berlin became vacant, Schumpeter had high hopes that he would get it. When this did not happen, he instead accepted an offer from Harvard University, and in 1932 he moved to Cambridge, Massachusetts for good. Schumpeter was slowly to regain his emotional composure, and also his capacity
to work, in the United States. In 1937 he married the economist Elizabeth Boody, who regarded it as her duty in life to take care of Schumpeter and manage his worldly affairs. With the support of his wife, Schumpeter produced three major works: Business Cycles (1939), Capitalism, Socialism, and Democracy (1942), and History of Economic Analysis (published posthumously in 1954).
2.1 The Works from the American Period

Of these three works, the one for which Schumpeter had the highest hopes was definitely Business Cycles (1939), which appeared exactly 25 years after his last book in Europe. This work is more than 1100 pages long, and it traces business cycles in Great Britain, the United States, and Germany during the years 1787–1938. Schumpeter had hoped that Business Cycles would get the kind of reception that Keynes' General Theory got, and he was deeply disappointed by the lack of interest that his colleagues and other economists showed. The impression one gets from histories of economic thought is that Business Cycles is rarely read today, mainly because few people are persuaded by Schumpeter's theory that business cycles are set off by innovations and that their motion can best be described as a giant Kondratieff cycle, modified by Juglar and Kitchin cycles; Schumpeter's handling of statistics is also considered poor. To a large extent this critique is justified, although it should also be noted that a number of themes and issues in this and other of Schumpeter's works are increasingly being referred to in evolutionary economics.

After many years of excruciatingly hard work on Business Cycles, Schumpeter decided that he needed an easier task to work on than the recasting of economic theory that had been on his agenda since his time in Bonn. In the late 1930s Schumpeter therefore began working on what he saw as a minor study of socialism. This minor study, however, quickly grew in size, and when it finally appeared in 1942—under the title Capitalism, Socialism, and Democracy—it was nearly 400 pages long. Of all of Schumpeter's works, this book (1942) was to become the most popular, especially in the United States, where it is still considered a standard work in political science. Capitalism, Socialism, and Democracy has also been translated into more languages than any other work by Schumpeter, and it is still in print.

Capitalism, Socialism, and Democracy contains an interesting history of socialist parties (part V), a fine discussion of the nature of socialism (part III), and an excellent discussion of the work of Karl Marx (part I). What has made this work into a social science classic, however, are its two remaining parts, with their discussion of capitalism (part II) and the nature of democracy (part IV). Capitalism, Schumpeter says, can be characterized as a process of continuous
economic change, through which new enterprises and industries are continuously being created and old ones continuously being destroyed. Schumpeter's famous term for this process is 'creative destruction.' Perfect competition of the type that economists talk about, he also says, is a myth; and monopolies, contrary to common belief and economists' dogma, are often good for the economy. Without monopolies, for example, very expensive forms of research and investment would seldom be undertaken. Schumpeter's theory of monopolies has led to a large number of studies of innovation. By 'the Schumpeterian hypothesis' two propositions are meant: that large firms are more innovative than small firms, and that innovation tends to be more frequent in monopolistic industries than in competitive ones. Each of these hypotheses can in turn be broken down into further hypotheses. Many decades later the results of all of these studies are still inconclusive, and there is no consensus on whether monopolistic forms of enterprise further or obstruct innovation (for an overview of research on the Schumpeterian hypothesis, see Kamien and Schwartz 1982).

Schumpeter's overall theory in Capitalism, Socialism, and Democracy—the idea that capitalism is on its way to disappearing and being replaced by socialism—has, on the other hand, led to little concrete research. One reason for this is probably that many readers of Schumpeter's book have felt that his argument is very general in nature and at times also quite idiosyncratic. Schumpeter argues, for example, that capitalism tends to breed intellectuals, and that these are hostile to capitalism; that the bourgeois class is becoming unable to appreciate property; and that the general will of the bourgeoisie is rapidly eroding. 'Can capitalism survive?' Schumpeter asks rhetorically, and then gives the following answer: 'No. I do not think it can.' Schumpeter's conclusion about the inevitable fall of capitalism has often been sharply criticized (e.g., Heertje 1981). His more general attempt to tackle the question of what social and economic institutions an economic system needs in order to work properly has, however, remained largely unexplored. There is also a certain tendency to cite catchy phrases from Schumpeter's work without paying attention to their organic place in his arguments.

What has always been much appreciated in Capitalism, Socialism, and Democracy is Schumpeter's theory of democracy. Schumpeter starts out by criticizing what he calls 'the classical doctrine of democracy,' which he characterizes in the following way: 'the democratic method is that institutional arrangement for arriving at political decisions which realizes the common good by making the people itself decide issues through the election of individuals who are to assemble in order to carry out its will.' What is wrong with this theory, he argues, is that there is no way of establishing what the common good is. There is also the additional fact that, according to this theory, politicians do not have interests
of their own; their function is exclusively to translate the desires of the people into reality. To the classical type of democracy Schumpeter counterposes his own theory of democracy, which he defines in the following manner: 'the democratic method is that institutional arrangement for arriving at political decisions in which individuals acquire the power to decide by means of a competitive struggle for the people's vote.' Some critics have argued that Schumpeter's theory is elitist in nature, since democracy is reduced to voting for a political leader at regular intervals. This may well be true, and Schumpeter's theory of democracy no doubt coexists in his work with a contempt for the masses. Regardless of this critique, however, Schumpeter's theory is generally regarded as considerably more realistic than the classical doctrine of democracy.

World War II was a very trying time for Schumpeter, since he felt emotionally tied to German-speaking Europe even though he was by now a US citizen. Schumpeter did not support the Nazis, but like many conservatives he felt that Communism constituted a more dangerous enemy than Hitler. Schumpeter's intense hatred of Roosevelt, whom he suspected of being a socialist, also put him on a collision course with the academic community in Cambridge during World War II. In his diary Schumpeter expressed his ambivalence towards Hitler, Jews, Roosevelt, and many other things; and in his everyday life he withdrew more and more into solitude and scholarship. The War, in brief, was a very difficult and demoralizing time for Schumpeter.

From the early 1940s Schumpeter worked feverishly on a giant history of economic thought that was to occupy him until his death some ten years later. He did not succeed in completing History of Economic Analysis, which was instead put together by Elizabeth Boody Schumpeter after several years of work. History of Economic Analysis is many times longer than Schumpeter's history of economics from 1914, and it is also considerably more sophisticated in its approach. Schumpeter's ambition with this work, it should first of all be noted, was not only to write the history of economic theory but also to say something about economic sociology, economic history, and economic statistics, the four 'fundamental fields' of economics or Sozialökonomik. It should also be mentioned that History of Economic Analysis begins with a very interesting methodological discussion of the nature of economics. This is then followed by a brilliant, though occasionally willful, discussion of economics from classical Greece to the years after World War I. The section on twentieth-century economics was unfortunately never completed, but it is clear that Schumpeter was very discontented with the failure of his contemporaries to create a truly dynamic economic theory.

Schumpeter died in his sleep on January 8, 1950; the cause was recorded as cerebral hemorrhage. Some
of his friends thought that the real reason was overwork, while his diary shows that he had been tired of life for a long time. Schumpeter's private notes also show that behind the image of brilliant lecturer and iconoclastic author, which Schumpeter liked to project, there was a person who felt deeply unhappy with the way his life had evolved and with his failure to produce a theory that would revolutionize economic thought. 'With a fraction of my ideas,' he wrote in dismay a few years before his death, 'a new economics could have been founded.'

See also: Business History; Capitalism; Democracy; Democracy, History of; Democracy: Normative Theory; Democratic Theory; Determinism: Social and Economic; Development, Economics of; Development: Social; Economic Growth: Theory; Economic History; Economic Sociology; Economic Transformation: From Central Planning to Market Economy; Economics, History of; Economics: Overview; Economics, Philosophy of; Entrepreneurship; Entrepreneurship, Psychology of; Imperialism, History of; Imperialism: Political Aspects; Innovation and Technological Change, Economics of; Innovation, Theory of; Marxian Economic Thought; Policy History: State and Economy; Political Economy, History of; Social Democracy; Socialism; Statistical Systems: Economic
Bibliography

Allen R L 1991 Opening Doors: The Life and Work of Joseph Schumpeter. Transaction Publishers, New Brunswick, NJ
Augello M 1990 Joseph Alois Schumpeter: A Reference Guide. Springer-Verlag, New York
Harris S E (ed.) 1951 Schumpeter: Social Scientist. Books for Libraries Press, Freeport, New York
Heertje A (ed.) 1981 Schumpeter's Vision: Capitalism, Socialism and Democracy After 40 Years. Praeger, New York
Kamien M I, Schwartz N L 1982 Market Structure and Innovation. Cambridge University Press, Cambridge, UK
Schumpeter J A 1908 Das Wesen und der Hauptinhalt der theoretischen Nationalökonomie. Duncker & Humblot, Leipzig
Schumpeter J A 1912 Theorie der wirtschaftlichen Entwicklung. Duncker & Humblot, Leipzig (actually appeared in 1911; 2nd edn. 1926) [trans. into English in 1934 as The Theory of Economic Development. Harvard University Press, Cambridge, MA]
Schumpeter J A 1914 Epochen der Dogmen- und Methodengeschichte. In: Bücher et al. (eds.) Grundriss der Sozialökonomik. I. Abteilung, Wirtschaft und Wirtschaftswissenschaft. J.C.B. Mohr, Tübingen, Germany, pp. 19–124
Schumpeter J A 1926 Gustav v. Schmoller und die Probleme von heute. Schmollers Jahrbuch 50: 337–88
Schumpeter J A 1939 Business Cycles: A Theoretical, Historical and Statistical Analysis of the Capitalist Process. McGraw-Hill, New York
Schumpeter J A 1942 Capitalism, Socialism, and Democracy. Harper and Brothers, New York
Schumpeter J A 1954 History of Economic Analysis. Oxford University Press, New York
Schumpeter J A 1954 Economic Doctrine and Method. Oxford University Press, New York
Schumpeter J A 1970 Das Wesen des Geldes. Vandenhoeck & Ruprecht, Göttingen, Germany
Schumpeter J A 1989 The instability of capitalism. In: Schumpeter J A (ed.) Essays. Transaction Publishers, New Brunswick, NJ
Schumpeter J A 1991 The crisis of the tax state. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 The sociology of imperialisms. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 Social classes in an ethnically homogeneous milieu. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 2000 Briefe/Letters. J.C.B. Mohr, Tübingen, Germany
Stolper W F 1994 Joseph Alois Schumpeter: The Private Life of a Public Man. Princeton University Press, Princeton, NJ
Swedberg R 1991 Schumpeter: A Biography. Princeton University Press, Princeton, NJ
Weber M (ed.) 1914–20 Grundriss der Sozialökonomik. J.C.B. Mohr, Tübingen, Germany
R. Swedberg
Schütz, Alfred (1899–1959)

Alfred Schütz was born in Vienna on April 13, 1899. There he studied law and social sciences from 1918 to 1921. After receiving his Doctorate of Law he continued his studies in the social sciences until 1923. His teachers included Hans Kelsen, an eminent lawyer, as well as Ludwig von Mises and Friedrich von Wieser, prominent representatives of the Austrian school of economics. Equally important as his university studies was his participation in the informal academic life of Vienna, e.g., in the private seminars of von Mises, where Schütz also cultivated his philosophical interests. In addition to his scholarly activities, Schütz held a full-time job as a bank attorney, a dual life which lasted until 1956.

After the annexation of Austria by the Third Reich, Schütz and his family escaped to New York. On his arrival, Schütz got in touch with Talcott Parsons, whose Structure of Social Action he intended to review. But despite their intense correspondence (Schütz and Parsons 1978), the contact between Schütz and Parsons broke down. Instead, Schütz became affiliated with the Graduate Faculty at the New School for Social Research in New York, where he started teaching as a Lecturer in 1943, advancing to full Professor of Sociology in 1952 and of Philosophy in 1956. He died on May 20, 1959 in New York.
1. Major Aims of Schütz's Theory

Alfred Schütz's social theory focuses on the concept of the life world, which is understood as a social-cultural world as it is perceived and generated by the humans living in it. Schütz identifies everyday interaction and communication as the main processes in which the constitution of the life world's social reality takes place. In this sense, his approach pursues two major goals: the development of a theory of action which reveals the constitution of social reality, with its meaning structure given in the commonly shared typical patterns of knowledge, and a description of the life world with its multiple strata of reality emerging from these constituting processes. These leitmotifs merge in a theory of life world constitution—a project which occupied Schütz during the final years of his life.
2. Main Intellectual Sources of the Schützian Approach

Schütz's conception originated in the intellectual discourse in the social sciences and philosophy of the early decades of the twentieth century. Schütz developed his position from within a triangular interdisciplinary field marked by Max Weber's historically oriented interpretative sociology (verstehende Soziologie), the Austrian economic approach represented by Ludwig von Mises, and the philosophical theories of the stream of consciousness formulated by H. Bergson and E. Husserl.

From the very outset, he adopted Max Weber's view of social reality as a meaningful social-cultural world, as well as his concept of social action as meaning-oriented behavior, and shared Weber's search for an interpretative method in sociology. He also shared the methodological individualism advocated by Weber and by the Austrian economists, who took individual action as the starting point of social research. At the same time, he accepted the need for a generalizing theory of action, as stressed by his teacher Ludwig von Mises, and criticized Weber for neglecting to develop the basic framework of such a theory and especially for not inquiring into the constitution of the meaning attached to social action. But Schütz's criticism also addressed the 'general theory of action' based on an 'a priori' postulate of rational choice put forth by Ludwig von Mises. Schütz rejected this conception since it imposed an abstract framework on actors' orientation of action and ignored their actual stock of knowledge.

In order to analyze how the meaning attached to action is revealed, Schütz referred to the philosophical concepts of Henri Bergson and Edmund Husserl. Both offer insights into the stream of lived experience and into the acts of consciousness in which our world is meaningfully constituted. Schütz adopted the Bergsonian idea of the stream of consciousness in his first scholarly attempts in 1924–28 (Schütz 1982), only later to recognize the difficulties in Bergson's intuitivism
and to turn to Husserl's phenomenology, which ultimately gave his approach its name. Husserl was concerned with the acts of consciousness which establish the meaningful taken-for-grantedness of the world in humans' natural attitude. His term 'life world,' which Schütz later adopted, refers to the reality constituted in those acts. Based primarily on this philosophical approach, Schütz wrote his first monograph, Der sinnhafte Aufbau der sozialen Welt, in 1932, in which he developed the basic concept of his theory (Schütz 1932). Starting from the criticism of Max Weber mentioned above, he devoted himself to the process in which humans create the social world as a reality that is meaningful and understandable to them.
3. Theory of the Life World with its Everyday Core

In order to grasp the constitution of the social world, Schütz (1932) had to transcend the realm of consciousness and perception analyzed by Husserl. He included both acts of consciousness and human action and interaction in the constitutive process under scrutiny. Departing from the transcendental philosophical approach, he developed his own mundane phenomenology, which analyzes the constitution of meaningful reality within social relationships in the everyday world. He proceeded in three steps, dealing with three problems constitutive of any theory of action concerned with the construction of social reality. (a) How does a meaningful orientation of human action emerge? (b) How can we understand the Other? (c) How is socially shared, intersubjectively valid knowledge generated?

Following the principle of methodological individualism, Schütz started with the analysis of single individual action. Relying on Bergson's concept of the inner stream of lived experience and on Husserl's investigations of the intentionality and temporality of consciousness, Schütz first explored the constitution of meaning in subjects as a temporal process. The meaning attached to a lived experience emerges from successive acts of selective reflection aimed towards it and framing it into a context of other experiences. The schemes of experience on which the primary meaning of action, i.e., its project, is based arise from this basic temporal dynamics and plasticity of consciousness. The temporality of the meaning attached to an action manifests itself in two kinds of motivation attached to its different temporal dimensions: 'in-order-to motives,' which direct action toward the future, and 'because motives,' representing the roots of the action in the past. However, the meaning of the project of action changes during the time when the projected action is
going on. The final meaning of an action can be found in the experience perceived by the subject when looking at the changes that have emerged in the meaning structure of the action once the action is completed. The meaning attached to an action, and consequently the schemes of experience, are thus affected not only by the acts of consciousness but also by the action itself.

In his second step, Schütz proceeded to show how the schemes of experience are shaped by interaction and communication. He perceived communication as a process in which two subjective streams of consciousness are coordinated within a social interaction (Wirkensbeziehung). Thus, communication signifies an interaction in which the meaning of ego's action consists in the intention to evoke a reaction from the alter. Actions in this sense have the function of signs which are mutually indicated and interpreted. The ultimate meaning of my action is revealed in the reaction of the Other and vice versa; communication therefore generates a chain of motives in which my in-order-to motives become the because motives of the Other, and provides a common stock of shared patterns of interpretation which allows for mutual understanding even though each of the actors always refers to their own schemes of experience. In this concept of understanding based on interaction, Schütz offers his own solution to the problem of intersubjectivity posed by Husserl. The constitution and appropriation of shared knowledge primarily takes place in long-lasting face-to-face interaction (we-relations), where mutual expectations are learned, verified, and sedimented into typical patterns that can be applied to more remote and anonymous strata of the social world. As a result, the meaning structure of the social world is characterized by typifications of actions, situations, and persons generated in interaction and communication.

In these three steps, Schütz laid down the main features of his theory of the life world. In his later work, Schütz (1962, 1964, 1966) characterized this communicatively created social reality as the world of everyday life, in which typical patterns are taken for granted, and which represents the intersubjective common core of the reality we live in. Later on (Schütz 1962, 1970), he disclosed further structural characteristics of this everyday core of the life world. Since its typical structure greatly depends on action, it is the pragmatic orientation of action that selects the areas where typification processes take place. Both typicality and this kind of selection, which Schütz designated as pragmatic relevance, represent two generative principles of order in the everyday world. They determine its formal structure and at the same time, when realized in action, shape this structure into a concrete social-cultural world, which is thus characterized by the social distribution and differentiation of knowledge. A third moment structuring the everyday world can be found in the rootedness of its constitution in individual action. Here actors and their bodies represent the
central point of the everyday world and its temporal, spatial, and social dimensions, which are arranged into spheres of past and future, of within-reach and distant, and of intimacy and anonymity, relative to the actors' own position (Schütz and Luckmann 1973).

Everyday reality is nevertheless not identical with the life world as a whole. By suspending their pragmatic interests, actors are able to modify their everyday experiences and perceive them as objects of a game, fantasy, art, science, or, if conscious attention is completely absent, as a dream. There are areas—even in the realm of everyday action directed by the principle of pragmatic relevance—which are beyond the sphere of actors' everyday practice and therefore transcend it. All these modifications represent different provinces of meaning transcending the everyday world and constituting the multiple realities (Schütz 1962) which make up the life world. Nevertheless, among the different provinces of the life world, the everyday core denotes a paramount reality, since it is only here that actors, driven by the fundamental anxiety of facing the finality of their lives, have to master their living conditions and are engaged in communicative processes producing the common knowledge which makes mutual understanding possible.

Schütz (1962) viewed communication as a substantial constitutive mechanism of social reality and stressed the role of language in this process. He considered language—including its syntax and semantics—as an objectivation of the sedimented, socially provided stock of knowledge that preserves the relevances and typifications inherent in cultures and social groups, and thus as crucial for the constitution of the life world (Schütz and Luckmann 1989). Schütz saw language as an essential case of the socially objectified systems of appresentation which bridge the transcendences between different areas and meaning provinces of the life world. Other important integrative mechanisms in the life world are symbolic structures, which are often based on language and which mediate between the everyday world and noneveryday realities such as religion, the arts, or politics (Schütz 1962).

Schütz did not restrict his study of the structure of the life world solely to theoretical research. He also applied his theory as a general interpretative scheme to several social fields and problems. He explored processes of intercultural communication and the social distribution of knowledge, the social conditions of equality in modern societies, as well as the fields of music and literature (Schütz 1964).
4. Consequences of the Theory of the Life World for the Methodology of the Social Sciences

Schütz considered the construction of ideal types to be the general methodological device of the social sciences, since sociology, economics, political science, etc.,
explain social phenomena by modeling ideal actors and courses of action which they endow with special features as explanatory variables. However, he did not view ideal types simply in the Weberian sense, i.e., as scientific models which need not correspond to reality in all respects. For Schütz, the ideal-typical method is legitimized by his findings on the typicality of the everyday world, which is the object of social research. Social reality can be approached by type-construction on the scientific level because its immanent structure is itself typified. Thus, the Schützian theory of the life world and its structures represents a methodological tool to bridge the gap between sociological theoretical reasoning and its object.

The methodological rule which Schütz derived from his theoretical approach consists correspondingly in the postulate of adequacy (Schütz 1962, 1964) between everyday and scientific typifications. This postulate holds that the ideal types featured by the social sciences have to be constructed in correspondence with the structure of everyday typifications, so that everyday actors could take the model for granted if they were to act under the conditions stated in the ideal type. In this sense, social scientists as well as everyday actors have to follow the same frame of reference given by the structure of the life world, but their cognitive attitudes differ in several respects: scientists neither share the pragmatic interest of everyday actors nor their everyday rationality, restricted as it is by beliefs in the grantedness of typical knowledge. Opposing T. Parsons's functionalism and C. G. Hempel's methodological positivism, Schütz stressed that scientists must not impose their own theoretical concepts and rationality on their object of study, the life world. They must first discover and then respect its immanent meaning structure.
5. The Significance of the Schützian Approach for the Social Sciences

Since the 1960s, Schütz's theory has drawn sociologists' attention to inquiries into everyday interaction and communication, as well as to the insight that social reality has to be considered a construction produced within these processes. Sociologists started to examine the practices of everyday action, communication, and interpretation from which social reality emerges. H. Garfinkel's (1967) Ethnomethodology, which aimed at finding the formal properties of everyday practices, led to a series of case studies covering a wide range of everyday life in society and its institutions. Continuing this line of research, examinations of everyday communication were provided by E. A. Schegloff and H. Sacks (1995), whose Conversation Analysis became a widespread method in qualitative sociological research. A. Cicourel's (1964) Cognitive Sociology revealed the constructed character of data in social institutions as well as in science, and thus initiated a
series of studies in the sociology of organizations and in the sociology of science. Milieus as formations of everyday interaction are the subject of the Social Phenomenology of R. Grathoff (1986). P. L. Berger and Th. Luckmann's (1966) idea of the Social Construction of Reality (as a general process in which cultural worlds emerge) triggered new impulses in both the sociology of knowledge and the sociology of culture, reconceived now in Schützian terms. In this context, Schütz's impact can also be seen in the sociology of language and the sociology of religion. Contemporary Marxian theory saw in the way Schütz focused on everyday practice a possible mediation between social structure and individual consciousness.

The term phenomenological sociology was coined in the 1970s (G. Psathas 1973) to address the spectrum of approaches oriented towards and inspired by Schütz's theory. Under this label, Schütz's approach became one of the general paradigms in interpretative social science and the theory of action. The diffusion and empirical application of the Schützian approach reinforced the search for qualitative research methods which would reveal data pertaining to the construction of social reality in everyday life. Aside from the ethnomethodology and conversation analysis mentioned above, this quest especially led to a refinement of the techniques of narrative interviews and of biographical research.

Once established in the 1970s, the Schützian paradigm influenced the mainstream of sociological theory, which became sensitive not only to the social construction of the life world but also to the phenomenological background of the Schützian theory. The 'life world,' in the sense of a basic social reality produced by humans in their 'natural' intercourse, became one of the central terms in social theory (J. Habermas 1981). The everyday construction of social reality was recognized as a crucial mechanism through which society emerges (P. Bourdieu 1972, A. Giddens 1976, Z. Bauman 1991). The phenomenological conception of meaning constitution, reformulated as an autopoiesis (self-creation) of social and psychic systems, influenced the development of contemporary sociological systems theory (N. Luhmann 1996).

Beyond the scope of sociology, other humanities also gained innovative impulses from the Schützian approach. In philosophy, Schütz's theory led to a critical assessment of the Husserlian view of intersubjectivity and to the conceptions of a worldly phenomenology (L. Embree 1988) and of a philosophy of modern anonymity (M. Natanson 1986). In literary studies, Schütz's constructionism inspired the aesthetics of reception, which pointed out the beholder's participation in co-creating the autonomous reality of literary works (H. R. Jauss 1982). Schütz's concept of the structure of the life world also affected theorizing in social geography (B. Werlen 1993), educational theory (K. Meyer-Drawe 1984), and political science (E. Voegelin 1966).
See also: Constructivism/Constructionism: Methodology; Culture, Sociology of; Ethnomethodology: General; Everyday Life, Anthropology of; Husserl, Edmund (1859–1938); Interactionism: Symbolic; Interpretive Methods: Micromethods; Knowledge, Sociology of; Methodological Individualism: Philosophical Aspects; Phenomenology in Sociology; Phenomenology: Philosophical Aspects; Verstehen und Erklären, Philosophy of; Weber, Max (1864–1920)
Bibliography

Bauman Z 1991 Modernity and Ambivalence. Polity Press, Oxford, UK
Berger P L, Luckmann Th 1966 The Social Construction of Reality. Doubleday, Garden City, New York
Bourdieu P 1972 Esquisse d'une Théorie de la Pratique, précédé de trois études d'ethnologie kabyle. Droz S.A., Geneva, Switzerland
Cefai D 1998 Phénoménologie et les Sciences Sociales: la Naissance d'une Philosophie Anthropologique. Librairie Droz, Geneva, Switzerland
Cicourel A V 1964 Method and Measurement in Sociology. The Free Press of Glencoe, New York
Embree L 1988 Worldly Phenomenology: The Continuing Influence of Alfred Schutz on North American Human Sciences. University Press of America, Washington, DC
Garfinkel H 1967 Studies in Ethnomethodology. Prentice-Hall, Englewood Cliffs, NJ
Giddens A 1976 New Rules of Sociological Method. Hutchinson & Co., London, UK
Grathoff R 1986 Milieu und Lebenswelt. Suhrkamp, Frankfurt/M, Germany
Habermas J 1981 Theorie des kommunikativen Handelns. Suhrkamp, Frankfurt/M, Germany
Jauss H R 1982 Ästhetische Erfahrung und literarische Hermeneutik. Suhrkamp, Frankfurt/M, Germany
Luhmann N 1996 Die neuzeitlichen Wissenschaften und die Phänomenologie. Picus, Vienna, Austria
Meyer-Drawe K 1984 Leiblichkeit und Sozialität. Fink, Munich, Germany
Natanson M 1986 Anonymity: A Study in the Philosophy of Alfred Schutz. Indiana University Press, Bloomington, IN
Psathas G (ed.) 1973 Phenomenological Sociology: Issues and Applications. John Wiley & Sons, New York
Sacks H 1995 Lectures on Conversation. Blackwell, Oxford, UK
Schütz A 1932 Der sinnhafte Aufbau der sozialen Welt. Springer, Vienna
Schütz A 1962 Collected Papers I. Nijhoff, The Hague, The Netherlands
Schütz A 1964 Collected Papers II. Nijhoff, The Hague, The Netherlands
Schütz A 1966 Collected Papers III. Nijhoff, The Hague, The Netherlands
Schütz A 1970 Reflections on the Problem of Relevance. Yale University Press, New Haven, CT
Schütz A 1982 Life Forms and Meaning Structure. Routledge and Kegan Paul, London
Schütz A 1995 Collected Papers IV. Nijhoff, The Hague, The Netherlands
Schütz A, Luckmann T 1973 The Structures of the Life-World I. Northwestern University Press, Evanston, IL
Schütz A, Luckmann T 1989 The Structures of the Life-World II. Northwestern University Press, Evanston, IL
Schütz A, Parsons T 1978 Theory of Social Action: The Correspondence of Alfred Schütz and Talcott Parsons. Indiana University Press, Bloomington, IN
Srubar I 1988 Kosmion: Die Genese der pragmatischen Lebensweltheorie von Alfred Schütz und ihr anthropologischer Hintergrund. Suhrkamp, Frankfurt am Main, Germany
Voegelin E 1966 Anamnesis: Zur Theorie der Geschichte und Politik. Piper, Munich
Wagner H R 1983 Alfred Schütz: An Intellectual Biography. University of Chicago Press, Chicago
Werlen B 1993 Society, Action and Space: An Alternative Human Geography. Routledge, London
I. Srubar
Science and Development

During the past half-century, conventional understandings of science, development, and their relationship have changed radically. Formerly, science was thought to refer to a clear and specific variety of Western knowledge with uniformly positive effects on society. Formerly, development was viewed as a unidirectional process of social change along Western lines. Formerly, science was viewed as a powerful contributor to the developmental process. Each of these ideas has been subjected to insightful criticism. This article will examine science and development, concluding that the relationship between the two is problematic, partly because of the complexity of the concepts themselves. Three major theories of development are considered, together with the main types of research institutions in developing areas.
1. Science

Much of what is termed science in developing areas is far from what would be considered 'pure science' in the developed world. The 'root concept' of science involves research, the systematic attempt to acquire new knowledge. In its modern form, this involves experimentation or systematic observation by highly trained specialists in research careers, typically university professors with state-of-the-art laboratory equipment. These scientists seek to contribute to a cumulative body of factual and theoretical knowledge, testing hypotheses by means of experiments and reporting their results to colleagues through publication in peer-reviewed journals.

Yet when a new variety of seed is tested by a national research institute and distributed to farmers in Africa, this is described as the result of 'science.' When the curator of a botanical exhibit has a college
degree, he or she may be described as 'the scientist.' When a newspaper column discusses malaria or AIDS, 'scientific treatments' are recommended. Seeds, educated people, and advice are not science in the abstract and lofty sense of the pursuit of knowledge for its own sake or of systematically verified facts about the world. But they are science from the standpoint of those who matter—local people who spend scarce resources on their children's education, development experts who determine how and where to spend funds, politicians who decide whether to open a new university, corporate personnel who open a new factory in a developing region.

Perhaps the most important shift in recent thinking about science is a broadening of the scholarly view to include the ideas of science found among ordinary people. These are often more extended in developing areas because of the association of science with 'modern' things and ideas. 'Science' in its extended sense includes technological artifacts, trained expertise, and knowledge of the way the world works. The importance of this point will become clear in the conclusion. Given the fuzziness of the boundaries that separate science from other institutions, and the dependence of modern research on sophisticated technical equipment, the term 'technoscience' is often used to denote the entire complex of processes, products, and knowledge that flows from modern research activities.

Even if we recognize that the term 'science' has extended meanings, it is useful to draw a distinction between (a) the institutions that produce knowledge and artifacts and (b) the knowledge that is produced. That is, on the one hand, there are organizations, people, and activities devoted to the acquisition of knowledge and of things that can be produced with knowledge. These constitute the modern organization of research. On the other hand, there are claims involving knowledge and artifacts—often significantly transformed as they leave the confines of the research laboratory. What makes claims and practices 'scientific' is their association with scientific institutions.

Modern research capacity is concentrated in industrialized countries. Indeed, with respect to the global distribution of scientific and technical personnel, scientific organizations, publications, citations to scientific work, patents, equipment, and resources, scientific institutions display extremely high degrees of inequality. The most common indicator of scientific output is publications. In 1995, Western Europe, North America, Japan, and the newly industrialized countries accounted for about 85 percent of the world total. Leaving aside the countries and allies of the former Soviet Union, developing areas contributed less than 9 percent of the world total. Much the same applies to technological output measured in patents and expenditures on research and development (UNESCO 1998).

Yet if we shift our focus from the question of inequality to the question of diffusion, an entirely different picture arises. To what extent have the idea
and practice of research spread throughout the world? The main issues here involve who conducts research, on what subjects, and what happens to the results. Each of these topics is the subject of analysis and controversy.

Scientific research in developing countries began during the pre-independence era with the establishment of universities and research institutes. Research was conducted on crops and commodities for export, as well as on conditions (e.g., disease) that affected the profits sought by external agents from their control over the land, labor, and property of colonized peoples. Methodologies and organizational models for research were brought by European colonists to Asia, Africa, and Latin America. During the era of independence and throughout the 1970s, the number of types of entities engaged in the generation of knowledge multiplied. The main scientific organizations now fall into five main types, or sectors: academic departments, state research institutes, international agencies, private firms, and, to a lesser degree, nongovernmental organizations.

2. Development

The concept of development involves several dimensions of transformation, including the creation of wealth (that is, rapid and sustained economic growth) and its distribution in a fashion that benefits a broad spectrum of people rather than a small elite (that is, a reduction in social inequality). Cultural transformation (recognition of, and attendant value placed on, local traditions and heritage) has also been viewed as an important aspect of the process since the early 1980s. There is general agreement that development in the second half of the twentieth century is not a mere recapitulation of the process of industrialization that characterized Europe and North America in the eighteenth and nineteenth centuries.

Three theoretical perspectives, with many variations, have dominated development studies: modernization, dependency, and institutional. One way of distinguishing these theories is by their position on the ways in which relationships external to a country affect the process of change. Since scientific institutions and knowledge claims are of external origin, each of these perspectives views science and technology as important in the development process, but with very different assessments of the costs and benefits.
2.1 Modernization

The oldest approach, sometimes called modernization theory, focused on the shift from a traditional, rural, agricultural society to a modern, urban, industrial form. Transformations internal to a country (such as formal education, a market economy, and democratic
political structures) are emphasized, while external relationships are de-emphasized. However, science was the exception to this, available to benefit developing nations through technology transfer from Western sources. This idea relied on two assumptions. One was the 'hardness' of technological artifacts—their alleged independence from people and culture, their seeming ability to produce certain effects 'no matter what.' The second was the 'linear model' of technology development, in which (a) the discoveries of basic science lead to (b) the practical knowledge of applied science and finally to (c) technological applications such as new products. In retrospect, both of these assumptions were simplistic in any context, but in the developing world they were especially problematic.

The assumption of 'hardness' has been replaced by the generalization that the uses, effects, and even the meanings of technological artifacts are affected by the context of use. First, effective technologies, from automobiles to indoor plumbing, do not typically stand alone, but are embedded in systems that provide infrastructure (roads, sewage treatment) which is often lacking. Second, the provision of artifacts such as buildings and computers is much easier than their maintenance, which requires both resources and knowledge. Third, the introduction of new technology involves a multiplicity of consequences—positive and negative, short term and long term, economic and ecological. Many of these consequences are unpredictable, even in those rare cases where such foresight is attempted.

The case of the Green Revolution is illustrative. In the 1960s, widespread food shortages, population growth, and predicted famine in India prompted major international foundations to invest in research and technology transfer efforts towards the goals of increasing agricultural productivity and modernizing technology. What resulted were new kinds of maize, wheat, and rice. These modern varieties promised higher yields and rapid maturity, but not without other inputs and conditions. They were, rather, part of a 'package' that required fertilizers as well as crop protection inputs such as pesticides, herbicides, and fungicides—sometimes even irrigation and mechanization. Moreover, seed for these varieties had to be purchased anew each year.

The consequences of the Green Revolution are still debated, and there is little doubt that many of them were positive. Famine in India was averted through increased yields, but the benefits of the technology required capital investments that were possible only for wealthier farmers. Not only did the adoption of new technology increase dependence on the suppliers of inputs, but it was claimed to increase inequality by hurting the small farmer—one intended beneficiary of the technology. The actual complexity of the outcomes is revealed by one of the most sophisticated assessments: modern seed varieties do reach small farmers, increase employment, and decrease food prices, but the benefits are less than expected because the poor are
Science and Deelopment increasingly landless workers or near landless farm laborers (Lipton and Longhurst 1989). What is important for the question of the relationship between science and development is that the products and practices of the Green Revolution were research-based technology. This technology was often developed in international research institutes funded by multilateral agencies such as the World Bank and bilateral donors such as the US Agency for International Development. Since the combined resources of these donors dwarf those of many poor countries, their developmental and research priorities constitute a broad global influence on the nature of science for development. The largest and most visible of these organizations form a global research network, the Consultative Group for International Agricultural Research (CGIAR) which grew from 4 to 13 centers during the 1970s as support by donors quadrupled. The influence of this network of donors and international agencies was clearly evident in the early 1990s when environmental concerns led to an emphasis on ‘sustainability’ issues. This led to a change in CGIAR priorities, as the older emphasis on agricultural productivity shifted to the relatively more complex issue of natural resource management. 2.2 Dependency Modernization theory emphasized internal factors while making an exception of science. Dependency theory and its close relative, world system theory, emphasized the role of external relationships in the developmental process. Relationships with developed countries and particularly with multinational corporations were viewed as barriers. Economic growth was controlled by forces outside the national economy. Dependency theory focused on individual nations, their role as suppliers of raw materials, cheap labor, and markets for expensive manufactured goods from industrialized countries. The unequal exchange relationship between developed and developing countries was viewed as contributing to poor economic growth. World system theory took a larger perspective, examining the wider network of relationships between the industrialized ‘core’ countries, impoverished ‘peripheral’ countries, and a group of ‘semiperipheral’ countries in order to show how some are disadvantaged by their position in the global system. Because of their overspecialization in a small number of commodities for export, the unchecked economic influence of external organizations, and political power wielded by local agents of capital, countries on the periphery of the global capitalist system continue to be characterized by high levels of economic inequality, low levels of democracy, and stunted economic growth. What is important about the dependency account is that science is not viewed in benign terms, but rather as one of a group of institutional processes that con-
contribute to underdevelopment. As indicated above, research is highly concentrated in industrialized countries. Dependency theory adds to this the notion that most research is also conducted for their benefit, with problems and technological applications selected to advance the interests of the core. The literature on technology transfer is also viewed in a different light. The development of new technology for profit is associated with the introduction and diffusion of manufactured products that are often unsuited to local needs and conditions, serving to draw scarce resources away from more important developmental projects. The condition of dependency renders technological choice moot. This concern with choice, associated with the argument that technology from abroad is often imposed on developing countries rather than selected by them, has resurfaced in many forms. In the 1970s it was behind the movement known as ‘intermediate’ technology, based on the work of E. F. Schumacher, which promoted the use of small-scale, labor-intensive technologies that were produced locally rather than complex, imported manufactured goods. These ‘appropriate’ technologies might be imported from abroad, but would be older, simpler, less mechanized, and designed with local needs in mind. What these viewpoints had in common was a critical approach to the adoption of technology from abroad. By the late 1980s and 1990s even more radical positions began to surface, viewing Western science as a mechanism of domination. These arguments were more closely related to ecological and feminist thought than to the Marxist orientation of dependency theory. Writers such as Vandana Shiva proposed that Western science was reductionist and patriarchal in orientation, leading to ‘epistemic violence’ through the separation of subject and object in the process of observation and experimentation (Shiva 1991). ‘Indigenous knowledge’ and ‘non-Western science’ were proposed as holistic and sustainable alternatives to scientific institutions and knowledge claims. Such views had an organizational base in nongovernmental organizations (NGOs), which received an increasing share of development aid during this period, owing to donor distrust of repressive and authoritarian governments in developing areas. NGOs have been active supporters of local communities in health, community development, and women’s employment, even engaging in research in alternative agriculture (Farrington and Bebbington 1993).

2.3 Institutional Theory

Institutional theory seeks to explain why nations are committed to scientific institutions as well as what forms these take. The central theme is that organizational structures developed in industrialized countries are viewed by policy makers, donors, and other states as signals of progress towards modern institutional
development and hence worthy of financial support. Regardless of the positive or negative consequences of their activities, the introduction and maintenance of certain forms in tertiary education and government serve to communicate this commitment. Institutional theory provides an account of the growth and structure of the academic and state research sectors, as successful organizations in industrialized nations operate as models far from their original contexts. Academic departments consist of researchers grouped by subject, each of whom is relatively free to select research projects. They bear the closest resemblance to the root concept of science introduced at the beginning of this article. But research requires time and resources. In areas such as sub-Saharan Africa, laboratories and fieldwork are poorly funded, if funded at all, since many institutions can barely afford to pay salaries. Professors teach, consult, and often maintain other jobs. Research is conducted as a secondary activity, and professional contacts with other scientists in Europe and the US are few. Equally important to the scientific establishment are state research institutes. These organizations are agencies of the state, charged with performing research relevant to development, with health and agriculture the two most important content areas. They are linked to ministries, councils, and international agencies as well as to systems (such as Extension Services in agriculture) that deliver technology to users—again based on a model from the developed world.
3. Relationships Between Science and Development

The popularity of dependency arguments and the resurgence of interest in indigenous forms of knowledge imply continued competition with older views of the uniformly positive effects of science. Institutional theory provides an alternative account of the spread of science and its organizational forms. But two features of current scholarship may prove more significant in the long run. First, extreme diversity exists among developing areas in terms of their economic, social, and cultural patterns. It makes decreasing sense to speak of ‘development’ as a single area of study. Latin American nations, for example, are generally far better positioned than the nations of sub-Saharan Africa. There is even wide variation within countries, as the case of India makes clear. While much of India qualifies as a developing area, it is among the world’s top producers of scientific work, has a technically skilled, English-speaking labor force second only to that of the US, and is a leading exporter of computer software for corporations. Second, ‘science’ is viewed as having many dimensions, many effects, and fuzzy institutional boundaries,
but it is always a feature of the modern, industrial, interconnected world. Science cannot be the cause of modernization because, in its diverse institutional articulations and its evolving fit with society, science exemplifies the meaning of modernization itself.

See also: Biomedical Sciences and Technology: History and Sociology; Development: Social-anthropological Aspects; Development Theory in Geography; Innovation and Technological Change, Economics of; Research and Development in Organizations; Science and Technology, Anthropology of; Science and Technology: Internationalization; Science and Technology, Social Study of: Computers and Information Technology; Technology, Anthropology of
Bibliography

Baber Z 1996 The Science of Empire: Scientific Knowledge, Civilization, and Colonial Rule in India. State University of New York Press, Albany, NY
Farrington J, Bebbington A 1993 Reluctant Partners: Nongovernmental Organisations, the State, and Sustainable Agricultural Development. Routledge, London
Gaillard J 1991 Scientists in the Third World. University of Kentucky Press, Lexington, KY
Kloppenburg J 1988 First the Seed: The Political Economy of Plant Biotechnology. Cambridge University Press, Cambridge, UK
Lipton M, Longhurst R 1989 New Seeds and Poor People. Johns Hopkins University Press, Baltimore, MD
Pearse A 1980 Seeds of Plenty, Seeds of Want: Social and Economic Implications of the Green Revolution. Clarendon Press, Oxford, UK
Schumacher E F 1973 Small is Beautiful: Economics as if People Mattered. Harper and Row, New York
Shahidullah S 1991 Capacity Building in Science and Technology in the Third World. Westview Press, Boulder, CO
Shiva V 1991 The Violence of the Green Revolution. Zed Books, London
Shrum W, Shenhav Y 1995 Science and technology in less developed countries. In: Jasanoff S, Markle G, Petersen J, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Stewart F 1977 Technology and Underdevelopment. Westview Press, Boulder, CO
UNESCO 1998 World Science Report 1998. Elsevier, Paris
Yearley S 1988 Science, Technology, and Social Change. Unwin Hyman, London
W. Shrum
Science and Industry

The premodern industrial craft economy provided the initial intersection of industry with science through scientific instrument making. The development of scientific inquiry through craft-based production, and its effects, can be seen in Galileo and the telescope,
changing the world-picture and the place of humans within it. From as early as Leeuwenhoek’s microscope to as late as the making of the Hubble telescope in the 1980s, lens and mirror construction was an uncertain art, not fully amenable to scientific analysis and control. The invention of moveable type transformed the transfer of knowledge, through storage and retrieval devices, and facilitated the development of modern scientific as well as popular literature. However, it was the development of the steam engine by a scientifically informed inventor, James Watt, which led to a systematization of practice that could be analyzed scientifically, becoming the basis for the next advance. This process gave rise to a new term, technology, to denote an innovation process as well as its results. Technology is the feedback link between science and industry. Invented in the eighteenth century, this new concept represented, on the one hand, the systematization of craft through the creation of engineering disciplines, originally to quantify and systematize the construction of military fortifications (Calvert 1967). On the other hand, technology was also derived from the extension of science. This occurred through the creation of applied sciences such as solid state physics and the invention of semiconductor devices such as the transistor. This article surveys the relationship between science and industry since the 1700s.
1. Early Science and Industrial Development

Science originated in the seventeenth century as organized investigation of the natural world according to relatively secure methodological principles. In this era, practical and theoretical concerns each provided between 40 and 60 percent of the impetus to research, with some overlap (Merton 1938). Well before science was institutionalized in universities and research institutes, individual scientists, loosely connected through scientific societies, provided an occasional basis for industrial development. For example, in the absence of reliable navigational techniques, commerce was impeded by the need for ships to stay close to shorelines. In response to a prize offered by the British government, astronomers used their observational techniques and knowledge base to develop star charts useful to navigators. Their involvement in solving a commercial problem was secured without incurring explicit costs: these were assumed by the astronomers themselves, who treated navigational research as an offshoot of their government or academic responsibilities that could be carried out at marginal cost. Clockmakers approached the problem from the opposite stance, by adapting a mechanical device to the solution of the navigational problem. Of lower status and with lesser financial resources and institutional backing than the astronomers, a clockmaker who
arrived at a mechanical solution to the problem had great difficulty in being taken seriously by the judges of the competition. Moreover, as an independent craftsman he was dependent upon receiving interim financial dispensations from the government to improve his device. Nevertheless, science and craft intersected to overcome blockages to the flow of trade by providing reliable navigational methods (Sobel 1996). Science became more directly involved in industrial production in seventeenth-century Germany, when professors of pharmacy invented medical preparations in the course of their experimentation. With support from some German princely states, and in collaboration with entrepreneurs, firms were formed to commercialize these pharmaceutical discoveries. Thus, the academic spin-off process, with governmental and industrial links, was adumbrated at this time (Gustin 1975).
2. Incremental Innovation: Learning by Doing

In an era when most industry was craft based, incremental improvements arose primarily from workers’ experience with the process of production. For example, in the course of firing bricks a worker might notice that a brick had attained exceptional strength under high heat and then attempt, through trial and error, to duplicate what the observation of a chance event had brought to light. Eventually, the conditions that produced the original improved brick might be approximately reproduced and a sufficiently high rate of production achieved, with the ‘experimenter’ knowing more or less how, but not really why, the useful result had been achieved (Landes 1969). Scientific advance was also built upon what is now called ‘learning by doing’ (Lundvall and Borras 1997). By interviewing various practitioners, researchers began to collate and systematize their local knowledge into broader syntheses. Thus, advances in the understanding of stratigraphy derived from miners’ practical experience in eighteenth-century Italy (Vaccari 2000). Much, if not most, innovation still takes place through craft-based experience. Indeed, scientific principles and methods such as those developed in operations research have recently been applied to systematize incremental innovation. Incremental innovation itself has been scientized through the application of statistical techniques, pioneered in Japan after World War II by the disciples of W. Edwards Deming, the US researcher, who only later gained consulting opportunities and renown in his own country.
3. The Industrialization of Science

The connection between academe, science, and industry strengthened with the invention of the chemistry laboratory as a joint teaching and research format
by Justus Liebig at the University of Giessen in the mid-nineteenth century (Brock 1997). Having achieved reliable analytical methods, Liebig could assign an unsolved problem to a student and expect a solution to arise in due course, with a minimum of supervision from the master or his assistants. The teaching laboratory model then spread to other experimental fields. As an organizational innovation combined with replicable methods of investigation, the laboratory allowed training and original investigation to be coordinated and expanded, creating a larger market for research equipment. Wilhelm von Humboldt theorized the unity of teaching and research as an academic model as well as a practical tenet (Oleson and Voss 1979). The incorporation of science into the university, along with methods to revive classical knowledge, led to the development of research as an academic practice. The research university model was transferred from Germany to the US in the mid-nineteenth century and eventually became the base for technology transfer and firm formation (Jencks and Riesman 1968). Liebig himself attempted to take the development of technology from academic research a step further by starting businesses based upon his scientific discoveries. His mix of successes and failures foreshadowed the contemporary science-based firm. Nevertheless, combining the roles of researcher and entrepreneur in one person was unusual at the time (Jones 1993). More typically, it was the professor’s students who turned technology arising from scientific research into companies. Thus, the emblem of the Zeiss optical firm incorporates a portrait of the original founders, including professor and student.
4. The Foundation of Firms Based on Scientific Research

With the invention of the laboratory, instrument making was internalized within science. As science and the need for research equipment grew, instrument production began to be externalized and made into an industry. Scientific instrument making is an early source of firm formation linked to academic research. For example, scientist-initiated instrument firms grew up around MIT and Harvard in the late nineteenth century, along with consulting firms such as A. D. Little, providing a formal overlay of relationships between university and industry beyond personal ties between teacher and former student. Until quite recently, scientific instrument making was a specialized niche industry, having more to do with academia than industry. A shift toward dual-use industrial and scientific technologies began to occur with the development of electronic instrumentation. Oscilloscopes served as research tools, for example, to record nerve impulses in physiology, but were also utilized to provide quality assurance in industrial
production. The formation of the Hewlett-Packard and Varian Corporations, based upon innovations in electronics in the physics department at Stanford University in the 1930s, exemplifies this transitional phase in the relationship between scientific instrumentation and industry (Lenoir 1997). Industrialized science is exemplified today by mass-produced scientific equipment such as the sequencing machines crucial to the human genome project. What is new is the breakdown of the distinction between scientific instrumentation and the means of industrial production. This has occurred through the emergence of technologies such as computers that have broad application to science, business, art, and other techniques (Ellul 1964). Indeed, the computer is a machine with such protean implications that it became the basis of a science itself and contributed to the creation of a new class of sciences of the artificial.
5. The Rise of Science-based Industry

Although it had its precursors in China, with gunpowder, paper making, and the organizational technology of bureaucracy, the science–industry interface has been rationalized in the West and North and transferred to the East and South in the modern era. Karl Marx, the initiator of the theory of science-based industry, was far-seeing, even though he lost confidence and retreated to a labor theory of value. In the late nineteenth century, he had one example on which to base his thesis: the British chemist Perkin’s research on aniline-based dyes that was translated into an industry in Germany. According to a close observer, ‘Science and technology may have converged circa 1900’ (Wise 1983). The rise of the chemical and electrical industries in the late nineteenth century fulfilled some of the promise of science for industrial development. Nevertheless, even in these industries the translation of science into useful products often had an intervening phase based on ‘cut and try’ methods that only later became rationalized, as in the unit method of scaling up chemical production (Servos 1980). Thomas Alva Edison, the inventor of the electric light, was also the inventor of the systematic production of inventions. His ‘idea factory,’ staffed by formally trained scientists and craftspersons, provided a support structure for the application of Edison’s basket of techniques to a series of technical problems whose solution was the basis for the creation of new industries and firms (Israel 1998). One of these companies, the General Electric Corporation (GE), took the connection between science and industry one step further in the US by hiring an academic scientist to organize a research laboratory for the firm. GE’s hiring of Willis Whitney brought the consulting function of academics to industrial problems within the firm and also created a new source of
product development, the corporate R&D laboratory (Wise 1983). A reverse relationship between science and engineering exists when corporations have ‘first mover’ rather than ‘follower’ business strategies. Siemens, for example, developed new business on the basis of advanced research in semiconductors during the early postwar period, while its competitor, AEG, waited to see if a sufficient market developed before entering a field. Indeed, Siemens’s confidence in science was so great that it did not employ sufficient engineers to refine and lower the cost of its production (Serchinger 2000). The function of the corporate R&D lab was at least threefold: (a) maintain contact with the academic world and other external sources of information useful to the firm; (b) assist in solving problems in production processes and product development originating within the firm; and (c) originate new products within and even beyond the existing areas of activity of the firm. A few corporate labs, typically in ‘public’ industries such as telecommunications, took on a fourth function as quasi-universities, contributing to the advance of science itself (Nelson 1962). Smaller firms typically focused on the first two functions, close to production, while larger firms more often spanned the entire range of activities (Reich 1987).
6. University–Industry–Government Relations

Models for close public–private cooperation that were invented before World War II have recently been revived and expanded (Owens 1990). During the early postwar period, an institutional division of labor arose in R&D, with industry supporting applied research; government funding basic research in universities and research institutes; and universities providing knowledge and trained persons to industry (Reingold 1987). This ‘virtuous circle’ still accurately describes much of the university–industry–government relationship (Kevles 1977). Socialist countries developed a variant of this format, attempting to create closer connections between research and production through central planning mechanisms. However, by locating the R&D function in so-called branch institutes, socialist practice actually created a distance between research and production that was even greater than in capitalist firms, which located their R&D units geographically apart from production while retaining organizational ties. Recently, socialist and, to some extent, large corporate formats for R&D have been decomposed and recombined in new formats. In the US, where these developments have taken their most advanced form to date, several innovations can be identified, including: (a) Universities extending their functions from research and training into technology development and firm formation through the establishment of new
organizational mechanisms such as centers, technology transfer offices, and incubator facilities. (b) The rise of start-up, high-tech firms, whether from universities, large corporate laboratories, or previous failed spin-offs, providing a new dynamic element in the science–industry relationship, both as specialized research organizations and as production units rooted in scientific advance. (c) A new role for government in encouraging collaboration in industrial innovation among various academic and industrial units, large and small, going beyond the provision of traditional R&D funding to include laws changing ‘the rules of the game’ and programs to encourage collaboration among firms and between universities and firms. An industrial penumbra appears around scientific institutions such as universities and research laboratories, creating feedback loops, as well as conflicts of interest and commitment between people involved in the two spheres. Over time, as conflicts are resolved, new hybrid forms of science-based industry and research, as well as new roles such as the industrial and entrepreneurial scientist, are institutionalized (Etzkowitz 2001).
7. Conclusion

Economic development is increasingly based on science and technology at the local, regional, national, and multinational levels. The science–industry connection, formerly an ancillary and subsidiary aspect of both science and industry, now moves to the center of the stage of economic development strategy as the university and other knowledge-creating organizations become a source of new industry. Political entities at all of these levels develop policies and programs to enhance the science–industry interface, and especially to encourage high-tech firm formation (Etzkowitz et al. 2000). The ‘endless frontier’ model, in which ‘knowledge flows’ were transferred to industry through publications and graduates, was insufficient to induce industrial innovation in most cases (Brown 1999). Closer connections were required. A series of organizational innovations to enhance technology transfer took place along a continuum: providing training in entrepreneurship, translating research findings into intellectual property rights, and encouraging the early stages of firm formation. The potential for participation in knowledge-based economic development revivifies the discredited nineteenth-century notion of progress. During the colonial era, technology transfer typically took place in a format that helped maintain political control, and the higher forms of tacit knowledge were kept secret by employing expatriate engineers. Nevertheless, as in India, a steel industry created by local entrepreneurs provided an economic base for a political
independence movement. Moreover, a seemingly overexpanded higher educational system, originally put in place to train lower-level bureaucrats and technicians, has become an engine of growth for India’s software industry. The synthesis of science and industry is universalized, as both industrial and natural heritages, such as the machine-tool industry in Germany and biodiversity in Brazil, are integrated into new global configurations, often through local firms in strategic alliance with multinationals. Given the decreasing scale of scientific equipment and the increasing availability of higher education, countries and regions in virtually every part of the world can take advantage of opportunities to develop niches. A trend toward worldwide technological innovation has transformed the previous designations of third and second worlds into the ‘emerging world.’ New candidates for science-based economic growth bridge the expected gaps between ‘long waves’ of economic growth. Heretofore, the technology-based long waves identified by Freeman and his co-workers were made possible by a few key technologies, most recently information technology and biotechnology (Freeman and Soete 1997). The potential sources of growth, from solar photovoltaics, multimedia, and the Internet to new materials arising from nanotechnology, are so numerous that there is little or no technical reason, only a political one, for business cycles with declines as well as rises.

See also: History of Science; History of Technology; Industrial Society/Post-industrial Society: History of the Concept; Industrial Sociology; Innovation: Organizational; National Innovation Systems; Reproductive Rights in Developing Nations; Science, Technology, and the Military; Scientific Revolution: History and Sociology
Bibliography

Brock W 1997 Justus von Liebig: The Chemical Gatekeeper. Cambridge University Press, Cambridge, UK
Brown C G 1999 Patent policy to fine-tune commercialization of government-sponsored university research. Science and Public Policy December: 403–14
Calvert M 1967 The Mechanical Engineer in America, 1830–1910: Professional Cultures in Conflict. Johns Hopkins University Press, Baltimore
Carnevale A 1991 America and the New Economy. Jossey Bass, San Francisco
Clow A 1960 The industrial background to John Dalton. In: Cardwell D S L (ed.) John Dalton and the Progress of Science. Manchester University Press, Manchester, UK
Davis L 1984 The Corporate Alchemist: Profit Makers and Problem Makers in the Chemical Industry. William Morrow, New York
Dyson F 1999 The Sun, the Genome and the Internet. Oxford University Press, New York
Ellul J 1964 The Technological Society. Knopf, New York
Etzkowitz H 2001 The Second Academic Revolution: MIT and the Rise of Entrepreneurial Science. Gordon and Breach, London
Etzkowitz H, Gulbrandsen M, Levitt J 2000 Public Venture Capital. Harcourt, New York
Freeman C, Soete L 1997 The Economics of Industrial Innovation. MIT Press, Cambridge, MA
Gustin B 1975 The emergence of the German chemical profession. Ph.D. dissertation, University of Chicago
Israel P 1998 Edison: A Life of Invention. Wiley, New York
Jencks C, Riesman D 1968 The Academic Revolution. Doubleday, New York
Johnson J 1990 The Kaiser’s Chemists: Science and Modernization in Imperial Germany. University of North Carolina Press, Chapel Hill, NC
Jones P 1993 Justus von Liebig, Eben Horsford and the development of the baking powder industry. Ambix 40: 65–74
Kevles D 1977 The NSF and the debate over postwar research policy, 1942–45. ISIS 68
Landes D 1969 The Unbound Prometheus. Cambridge University Press, Cambridge, UK
Lenoir T 1997 Instituting Science: The Cultural Production of Scientific Disciplines. Stanford University Press, Stanford, CA
Lundvall B-A, Borras S 1997 The Globalising Learning Economy. The European Commission, Brussels
Merton R K 1938 Science, Technology and Society in Seventeenth Century England. St. Catherines Press, Bruges
Nelson R 1962 The link between science and invention: The case of the transistor. In: The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ, pp. 549–83
Oleson A, Voss J 1979 The Organization of Knowledge in Modern America. Johns Hopkins University Press, Baltimore
Owens L 1990 MIT and the federal ‘Angel’: Academic R&D and federal–private cooperation before World War II. ISIS 81: 181–213
Reich L 1987 Edison, Coolidge and Langmuir: Evolving approaches to American industrial research. Journal of Economic History 47: 341–51
Reingold N 1987 Vannevar Bush’s new deal for research; or the triumph of the old order. Historical Studies in the Physical and Biological Sciences 17: 299–344
Serchinger R 2000 Wirtschaftswunder in Pretzfeld, Upper Franconia: Interactions between science, technology and corporate strategies in Siemens semiconductor rectifier research & development, 1945–1956. History and Technology 335: 82
Servos J 1980 The industrial relations of science: Chemical engineering at MIT, 1900–1939. ISIS 71: 531–49
Sobel D 1996 Longitude. Penguin, Baltimore
Vaccari E 2000 Mining and knowledge of the Earth in eighteenth century Italy. Annals of Science 57(2): 163–80
Wise G 1983 Ionists in industry: Physical chemistry at General Electric, 1900–1915. ISIS 74: 7–21
H. Etzkowitz
Science and Law

The relationship between law and science has occupied scientists, philosophers, policymakers, and social analysts since the early modern period. Nature, according
to science’s early modern practitioners, was governed by law-like regularities—God’s laws—and the work of science lay in discerning and revealing these laws of nature. In time, however, human law came to be seen as a fallible institution in need of rationalization, to be made more like a science in order to avoid arbitrariness. For early twentieth-century legal reformers, science provided a model of regularity for law to aspire to. At the same time, law and science were viewed by their practitioners as independent institutions, each with its own organization, practices, objectives, and ethos. Similarities and differences between the two fields were noted by many observers, with frequent commentary on conflicts between the law’s desire for justice and science’s commitment to the truth. By the end of the twentieth century, a new preoccupation with the law’s instrumental uses of science emerged, which led to divergent schools of thought about the epistemological status of scientific evidence and the ways in which legal and policy institutions should interact with scientific experts. The growth of the state’s regulatory powers increased governments’ dependence on scientific methods and information, and disputes developed about the extent to which science could reliably answer the questions put to it by legislators, regulators, and litigants. An influx of technically complex disputes caused judicial systems to reassess their handling of scientific and technical evidence. Analysts of these processes grappled with questions about the law’s capacity to render justice under conditions of scientific and social uncertainty, as well as the continued relevance of the lay jury and the role of legal proceedings in the production of scientific knowledge. This article reviews the resulting scholarship on the nature and consequences of the law–science relationship under four major headings. One concerns the role of these two institutions in shaping the authority structures of modernity, particularly in legitimating the exercise of power in democratic societies. A second relates to the law’s impact on the objectives, status, and professional practices of scientific disciplines relevant to the resolution of social problems. A third focuses on responses by courts and their critics to the challenges posed by experts in the legal system, including shifts in the rules governing expert testimony. Fourth and finally, an emerging line of research looks at the law as a site for generating scientific and technical knowledge and inquires into the implications of collaboration between law and science for individual and social justice.
1. Law and Science in the Transition to Modernity

Since the origins of modern scientific thought, the term ‘law’ has been used to describe both the regularities discernible in nature and the rules laid down by
religious or secular authorities to guide human conduct. Both kinds of laws, it was once popularly assumed, embodied principles that should work always and everywhere in the same fashion, even though Robert Boyle, an early modern founder of experimental science, cautioned against too literal an assimilation of the behavior of matter under natural law to the behavior of human agents under civil law (Shapin 1994, pp. 330–1). As certain facts about human nature and behavior (such as the approximate equality of humans, their vulnerability, and the limits of their resources and altruism) came to be accepted as given, modern legal theorists adopted more synthetic views regarding the relationship between science and law. According to the renowned British scholar of jurisprudence H. L. A. Hart, for example, these natural facts underwrite a ‘natural law’ which human societies require in order to maintain their salient moral characteristics (Hart 1961). The American legal theorist Lon Fuller took a more sociological tack in his account of the ‘morality of law.’ Echoing Robert Merton’s well-known essay on the norms of science, Fuller argued that law and science resemble each other because each institution possesses an internal morality resulting from its distinctive arrangements, practices, and fiduciary demands (Fuller 1964). Later legal scholarship compared the ‘cultures’ of law and science and likewise called attention to their normative and procedural particularities (Goldberg 1994, Schuck 1993), but these institutional characteristics were now seen as a source of conflict, or culture clash. As elite institutions, science and law historically have drawn upon each other’s practices to build their authority. Thus, early modern experimentalists in the natural sciences enlisted the support of witnesses, both real and ‘virtual’ (Shapin and Schaffer 1985), as if they were proving a case in a court of law. The Enlightenment of the eighteenth century and the rise of liberal democracies brought science, now prized for its practical utility (Price 1985), into a more openly instrumental partnership with the law. In the USA, Thomas Jefferson drafted the fledgling American republic’s Patent Act of 1793, thereby implementing the constitutional grant of power to Congress ‘to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.’ Where Jefferson saw the state as sponsoring science, some modern scholars have seen science’s open, self-regulating, nonhierarchical modes of governance as offering a model or prototype for democracy (Polanyi 1962). Others, however, regard the relationship of scientific and political authority as one of mutual dependency. Democratically constituted governments, they argue, need science and technology in order to answer constant public criticism and make positive demonstrations of their worth to citizens. States, therefore, have harnessed science and technology to instrumental projects in both war and peace.
In these legitimation exercises, the public attests to the state’s technical performance, much as scientists bore witness to each other’s observations in early modern experimental laboratories (Ezrahi 1990). The law’s procedural forms in this way have underwritten the credibility of both modern science and the state. The law’s own dependence on science has grown by distinctive pathways in the USA, as could be expected in a society where public decisions are exceptionally transparent and significant policy change commonly proceeds through litigation. By the beginning of the twentieth century, ideological and political shifts fostered the idea that social policies enacted by the state could and should be responsive to findings in the sciences. An icon of the Progressive era, US Supreme Court Justice Louis D. Brandeis, spurred this development through his historic ‘Brandeis brief’ in Muller v. Oregon (1908). Written to defend the state’s 10-hour working-day restriction on women’s employment, the 100-page brief is widely regarded as the first attempt to bring social science insights into the courts. Brandeis, then a public interest lawyer in Boston, argued successfully that women were sufficiently different from men to justify special legislative protection, thereby distinguishing the Oregon case from the Court’s earlier, infamous decision in Lochner v. New York (1905), which struck down a similar state law curtailing working hours for (male) bakers. Social science evidence received a further boost when, in overturning the ‘separate but equal’ doctrine that was the basis for racial segregation in public schools, the Supreme Court in Brown v. Board of Education (1954) relied on psychological studies of segregation’s ‘detrimental effect upon the colored children.’ In later years, students with disabilities would benefit from similar sorts of evidence in claiming rights to special education (Minow 1990). Employment discrimination, racial segregation, antitrust, capital punishment, and environmental liability were among the many litigated subjects that brought social science testimony into the US courts in the latter decades of the twentieth century. These cases kept alive the view that objective scientific information could be invoked in legal proceedings to combat various forms of oppression and social injustice. If science was allied with the search for justice in US courts, its role in policymaking was primarily as the voice of reason. Government’s increasing involvement in welfare policy in the New Deal, intensified by the rise of social regulation in the 1970s, created new demands for rational decisions concerning a host of costly and potentially explosive problems, such as race and the environment. As can be observed in the context of environmental regulation (Jasanoff 1990), federal agencies proved increasingly less able to take refuge in generalized claims of expertise. Much work went into separating science from politics and creating new institutions to deliver putatively independent advice to government, although episodes like the 1995
dissolution of the Office of Technology Assessment, an advisory body to the US Congress, called attention to the fragility of such efforts. New types of scientific expertise also evolved to support governmental decisions with substantial distributive impacts, as discussed in the next section. The relationship between law and science in modern European states was in some ways more covert than in the USA, but nonetheless pervasive. The crises of industrialization in the nineteenth century and the ensuing rise of the welfare state ushered in a demand for credible public action that states could not satisfy without rapid increases in their capacity to order society through newly accredited social science disciplines (Wagner et al. 1991). In turn, the classifications offered by the human sciences in particular supplied the basis for new forms of self-understanding, normalization, and social control. Whether in medicine, psychology, criminology or studies of intelligence, science offered the means for dividing people into categories of normal and pathological, mainstream and deviant. The scientific study of human characteristics thus enabled members of society to see and discipline each other without the need for constant, formal regulatory oversight by the state (Foucault 1990). Science, on this account, not only provided supports for legal authority but actually replaced some of the ordering functions of positive law. European institutions at first resisted the pressures that fed attempts to separate science from politics in US legal contexts. Expert advisory bodies were routinely constituted to include political as well as technical expertise, although judgments about the need for lay or nonprofessional expertise varied from country to country. European courts, for their part, dealt with more focused and on the whole less politically salient disputes than in the USA, so that there were fewer reasons to contest expert judgments. By the 1990s, however, a number of developments strained public confidence in European governments’ uses of science, chief among them the growing political influence of the European Union’s technical bureaucracies and the fallout from the UK government’s botched response to the bovine spongiform encephalopathy (BSE) crisis, which cast doubt on the integrity and impartiality of experts. European lawmakers, regulators, and courts began reconsidering the adequacy of their institutional arrangements for science-based decision-making, as the issue of science and governance rose to new prominence on political agendas.
2. Disciplines, Professions, and the Law

The interactivity of law and science has spurred new disciplinary formations and helped to change professional norms and practices, especially in the human sciences, which serve both as allies in regulation and
law enforcement and as uneasy subjects of the law’s skeptical gaze. This dual relationship with the law has prompted novel lines of scientific research and theorizing, accompanied by periodic legal encroachment upon science’s professional autonomy. Developments in the fields of risk analysis and psychology clearly illustrate these patterns. The scientific study of risk is an outgrowth of state efforts to manage the hazards of industrial production and an extension of modern governments’ growing reliance on formal decision-making tools to rationalize difficult political decisions. Uncertainty about technology’s risks and benefits rose through the twentieth century along with the size and complexity of systems of production. From village-based economies, in which injured parties presumed they knew the agents as well as the causes of harms inflicted on them, the global economy pulled consumers into webs of relationships that diffused both the knowledge of risks and the responsibility for regulation and compensation. Epitomizing the plight of industrialization’s innocent victims, millions of asbestos workers in mines, shipyards, and other industries were unknowingly exposed to severe health hazards, and hundreds of thousands died of consequent illnesses. Resulting lawsuits financially crippled asbestos manufacturers without adequately recompensing injured workers. A 1984 accident at a US corporation’s subsidiary chemical plant in Bhopal, India, again produced hundreds of thousands of victims, with massive scientific and legal uncertainties clouding the settlement of claims. As the pervasiveness of industrial hazards became apparent, legislatures sought to ensure that regulation would not wait for proof in the form of ‘body counts’: risk, not harm, became the basis for action under new laws safeguarding health, safety, and the environment, and state agencies struggled to develop credible processes for assessing and mitigating risks before anyone was injured. The sciences, for their part, responded with attempts to mobilize existing knowledge and methods to provide more useful tools for analyzing risks and crafting preventive strategies. From relatively modest beginnings, rooted largely in the engineering and physical sciences (as in assessing the risks of meltdown in nuclear power plants), risk assessment in support of regulatory standards grew into an important, and increasingly contested, branch of the social sciences. Risk studies acquired the indicia of professionalization, with dedicated journals, societies, curricula, and centers at major universities. One focus of research and development in this field, particularly in the USA, was the construction of reliable models for measuring risk, from the use of animal species as surrogates for humans to sophisticated methods of representing uncertainty in risk estimates (National Research Council 1994). Risk assessment presented not only methodological difficulties for scientists and engineers, but also institutional challenges for governments needing to persuade industry
and the public that their decisions were scientifically sound and politically balanced (Jasanoff 1990). Lawsuits questioning the risk assessment practices of federal agencies were not uncommon and led to several significant judicial decisions, including Supreme Court rulings on the validity of the occupational standard for benzene in 1980 and the air quality standard for ozone and particulates in 2001. Other researchers meanwhile approached risk from the standpoint of social psychology, asking why lay observers frequently diverged from experts in their judgments concerning the relative significance of different types of risk. This question engendered a sizeable body of experimental and survey work on risk perception (Slovic 2000), as well as theoretical critiques of such work, showing problems in both the framing and characterization of lay attitudes (Irwin and Wynne 1994). While some social scientists occupied themselves with identifying and measuring ‘real’ risks, or perceptions of them, others treated risk itself as a new organizing principle in social life, noting that vulnerability to risk did not map neatly onto earlier divisions of society according to race, class or economic status (Beck 1992). By contrast, the US environmental justice movement generated data suggesting that social inequality still mattered in the distribution of industrial risks, which fell disproportionately on poor and minority populations (Bullard 1990). This strategic, though controversial, research led in 1994 to a presidential executive order requiring US federal agencies to consider the equity implications of their actions; equity analysis emerged in this way as an offshoot of existing methodologies for assessing risk and offered formal justification for legal claims of environmental injustice. Unlike risk analysis, which is a product of twentieth-century concerns, law and the mental health disciplines have existed in a relationship of close reciprocity at least since 1843, when the British House of Lords laid down the so-called M’Naghten rule for determining whether a criminal defendant is entitled to relief on grounds of insanity. The rule holds that a person may be relieved of criminal responsibility for actions taken while he or she was unable to distinguish between right and wrong as a result of mental disease or defect. More recently, the human sciences’ ability to sort people’s behavior and capacities into seemingly objective, ‘natural’ categories has served the needs of both democratic and totalitarian states in a growing variety of legal settings. The need to determine people’s mental states in civil as well as criminal proceedings was an important driver of the legal system’s enrolment of professionals from these fields. In US law, for example, psychiatric evidence was used to determine whether a criminal accused could stand trial, whether conditions for sentencing leniency or parole were met, and whether a convicted criminal posed a sufficient threat of long-term dangerousness to
deserve the death penalty. While these intersections provided new professional opportunities for psychiatry and psychology, spawning a thriving pedagogic specialty of law and mental health (Gutheil and Appelbaum 2000), entanglements with the law also disclosed frequent divisions among experts and undermined disciplinary authority in other ways. The most visible challenge to professional autonomy occurred in the 1974 US case of Tarasoff v. Regents of the University of California (redecided in 1976), in which the California Supreme Court ruled that psychiatrists had a duty to warn (later changed to a duty to protect) third parties threatened by their patients. Although the decision did not open the floodgates to liability as many therapists had feared, Tarasoff was widely seen as influencing therapeutic practice, often in counterproductive ways. In another notable decision affecting the mental health field, the US Supreme Court in Barefoot v. Estelle (1983) essentially ignored an amicus brief by the American Psychiatric Association stating that psychiatric predictions of dangerousness are wrong in as many as two out of three cases. The Court expressed confidence in juries’ capacity to evaluate expert evidence even if it is weak or deficient. By the end of the twentieth century, the alliance between law and the human sciences—joined, eventually, by the nascent science of genomics—helped to define a host of novel disorders and syndromes that were invoked to extenuate or penalize various types of conduct. Post-traumatic stress disorder (PTSD), for example, along with its specific manifestations such as rape trauma syndrome, entered the DSM-IV, the official diagnostic handbook of the mental health professions. So codified, PTSD became not only an identifiable and presumably treatable medical condition, but also a basis for asserting new legal claims (as in lawsuits for stress-related psychological injury in the workplace) or defending oneself against them (as in murder trials of sexually abused defendants). Possibly the most notorious example of synergy between law and psychology occurred in a rash of so-called recovered memory and child abuse cases, which became especially prevalent in the US in the 1980s (Hacking 1995). Initially gaining credibility from psychiatric experts, who testified to the reliability of recovered childhood memories, these cases caused scandal as witnesses recanted and it became apparent that some experts had more likely instilled than elicited the shocking ‘memories.’ The profession eventually organized to deny the scientific validity of the claims and repudiate the expertise of the claimants (Loftus and Ketcham 1994). Eyewitness identification, another branch of research dealing with memory and recall, also received its chief impetus from the legal process. Prompted by insupportably high error rates on the part of eyewitnesses, psychologists tested people’s ability to recognize persons they had encountered in
stressful situations and discovered that many factors, such as race, inhibit accurate identification under these circumstances (Wells 1988). These findings began to have an impact on police procedure by the late 1990s as awareness dawned that relatively simple changes, such as sequential rather than simultaneous presentation of suspects in line-ups, could substantially improve the reliability of eyewitness testimony.
3. Scientific Evidence and Expert Witnesses

Whereas civil law systems relegated the production of scientific and technical evidence largely to national forensics labs and court-appointed experts, common law proceedings traditionally left it to the parties themselves to produce evidentiary support for their claims. The adversary process, with its right of cross-examination, was assumed to be equal to the task of testing the evidence, no matter how arcane or technical, thereby permitting the fact-finder—the judge or jury—to ascertain the truth. Although commentators sometimes deplored the practice of treating experts as hired guns, the parties’ basic right to present their evidence was not in doubt, and courts rarely used their legally recognized power to appoint independent experts who might offer a more disinterested view (Jasanoff 1995). By the turn of the century, several factors combined to challenge this hands-off attitude. The sheer volume of cases demanding some degree of technical analysis was one contributing cause, especially in the US, where low entry barriers to litigation and the inadequacy of social safety nets routinely brought into the courts controversies that were resolved by other means in most common law jurisdictions. Highly publicized instances of expertise gone awry, as in the recovered memory cases, shook people’s confidence in the power of cross-examination to keep out charlatans presenting themselves as scientists. The economic consequences of successful products liability lawsuits strained the system from yet another direction, especially in a growing number of mass tort actions that threatened industries with bankruptcy. It was tempting to blame many of these developments on the perceived weaknesses of the adversary system: the passivity and low scientific acumen of judges, the technical illiteracy of juries, the gamesmanship of lawyers, the bias or incompetence of experts selected by parties. These deficits led to calls for reforming the process by which expert testimony was admitted into the courtroom. The initial round of critique marked a turn away from the liberal ideals of the 1970s and proved to be politically powerful, although it was neither methodologically rigorous nor of lasting scholarly value. An early broadside accused courts of deciding cases on the basis of ‘junk science,’ a rhetorically useful concept that took hold with opponents of the burgeoning tort
system but proved difficult to characterize systematically (Huber 1991). The barely concealed program of this and related writings was to remove as much as possible of the testing of scientific evidence from the adversary system, particularly from the jury’s purview, and to decide these issues either in judicially managed pretrial proceedings or with the aid of scientists appointed by the courts. Undergirding this literature was an almost unshakable faith in science’s self-corrective ability and a technocratic conviction that truth would eventually win out if only the legal system left scientific fact-finding to scientists (Foster and Huber 1997). These attacks also tacitly presumed that mainstream expert opinion could be identified on any issue relevant to the resolution of legal disputes. These positions would later be thrown into doubt, but not until the spate of polemical work had left its mark on American scientific and social thought. Studies by practicing scientists, physicians, and some legal academics reiterated and amplified the theme of the law’s misuse of science. Critics cited as evidence a number of tort cases in which multimillion dollar jury verdicts favorable to plaintiffs conflicted with the opinions of respected scientists who denied any causal connection between the alleged harmful agent, such as a drug or workplace toxin, and the harm suffered. Particularly noteworthy was the litigation involving Bendectin, a medication prescribed to pregnant women for morning sickness and later suspected of causing birth defects in their children. Juries frequently awarded damages despite assertions by epidemiologists that there was no statistically significant evidence linking Bendectin to the claimed injuries. Research on the role of experts in these cases showed suggestive differences in behavior (such as higher rates of repeat witnessing) between plaintiffs’ and defendants’ experts (Sanders 1998), leading some to question whether judges were adequately screening offers of scientific testimony. Another episode that drew considerable critical commentary was litigation involving silicone gel breast implants. Close to a half-million women surgically implanted with these devices sued the manufacturer, Dow Corning, claiming injuries that ranged from minor discomfort to permanent immune system damage. The company’s initial settlement offer broke down under the onslaught of lawsuits, but epidemiological studies, undertaken only after the commencement of legal action, indicated no causal connection between the implants and immune system disorders. Publication of these results in the prominent New England Journal of Medicine led its executive editor to join the chorus of accusation against the legal system’s apparent misuse of science (Angell 1996). Rising discontent about the quality of scientific evidence led the US Supreme Court to address the issue for the first time in 1993. In Daubert v. Merrell Dow Pharmaceuticals, the Court set aside the so-called Frye rule that had governed the admissibility of expert testimony for the preceding 70 years. Instead of simply
demanding that scientific evidence should be 'generally accepted' within the relevant peer community, the Court urged judges to screen science in pretrial proceedings, in accordance with standards that scientists themselves would use. The Court offered four sample criteria: (a) was the evidence based on a testable theory or technique, and had it been tested; (b) had it been peer reviewed; (c) did it have a known error rate; and (d) was the underlying science generally accepted? Two more major evidence decisions of the 1990s cemented the high court's message that trial judges should play a vastly more proactive gate-keeping role when confronted by scientific and technical evidence. As judicial involvement on this front increased, new complaints appeared on the horizon: that judges were using the Daubert criteria as an inappropriately inflexible checklist rather than as the guidelines they were meant to be; that mechanical application of Daubert and its progeny was trumping deserving claims; that judges were usurping the jury's constitutional role; and that misinterpretation of Daubert was introducing unscientific biases and impermissibly raising the burden of proof in civil cases.

In most civil law jurisdictions, by contrast, the inquisitorial approach to testing evidence, coupled with the state's near monopoly in generating forensic science, precluded much controversy about the legitimacy of experts or the quality of their testimony. Tremors could be detected, however, arising from such episodes as the discovery of substandard or fabricated evidence in British trials of Irish terrorists and child abuse suspects and the French government's cover-up of the contamination of the blood supply with HIV, the virus that causes AIDS. Ironically, as US courts moved to consolidate their gate-keeping function in the 1990s, constricting the entry routes for experts, a countermove could be discerned in some European legal systems to open the courts, and policy processes more broadly, to a wider diversity of expert opinion (Van Kampen 1998).
4. Sites of Knowledge-making

Investigation of the law–science relationship took a considerably more sophisticated turn in the 1990s as it attracted the attention of a new generation of researchers in science and technology studies. From this disciplinary vantage point, the use–misuse framing that dominated popular writing appeared superficial in the extreme when set against the observed richness of interactions between the two institutions. Rather, as this body of work began to document, it was the hybridization of law and science that held greatest interest and significance for science and society. At this active frontier of social problem solving, one could observe in fine-grained detail how important normative and epistemological commitments either reinforced or contradicted each other in contemporary societies. For example, even within the common law's adversarial environment, the ritualistic character of legal proceedings successfully held the line against certain forms of radical skepticism, leaving untouched traditional hierarchies of expert authority (Wynne 1982), a belief in the ultimate accessibility of the truth (Smith and Wynne 1989), and an abiding faith in technological progress (Jasanoff 1995).

At the same time, the legal process increasingly came to be recognized as a distinctive site for the production of otherwise unavailable knowledge and expertise. Even in cases where litigation arguably produced anomalous results—such as Bendectin and breast implants—resort to the courts demonstrably added to society's aggregate store of information. Other cases were unambiguously progressive. A particularly striking example was the importation of molecular biological techniques into the arenas of criminal identification and paternity testing in the form of 'DNA typing.' Not only did the legal system's needs provide the initial context for the technique's development, but subsequent contestation over the reliability of DNA-based identification had impacts on the field of population genetics and spurred the formation of new testing methods and agencies, as well as the standardization of commonly used tests (Lynch and Jasanoff 1998). As the technique became black-boxed, new market niches for it emerged within the legal process, most spectacularly as a means of exonerating mistakenly convicted persons. The success of DNA as a forensic tool thus helped to destabilize the legal system's longstanding commitment to eyewitness evidence, and even its reliance on the seemingly unassailable authority of fingerprints (Cole 2001), thereby facilitating critiques of a possibly antiquated privileging of visual memory (Wells 1988). But the massive uptake of forensic DNA analysis by the criminal justice system also opened up new areas of social concern, such as privacy and database protection, which called for continued vigilance and ingenuity on the part of the legal system.

The rise of DNA typing, like that of risk analysis and post-traumatic stress disorder, captured the dynamics of an epoch in which the establishment of any form of social order became virtually unthinkable without the mutual engagement of law and science. In such a time, commentary deploring the law's alleged misuse of science appeared increasingly reductionist and out of touch with everyday reality. The emergence of science and law as a special topic within science and technology studies responded to the inadequacies of earlier analyses and showed greater intellectual promise. A major contribution of this work was to abandon once and for all the pared-down 'clashing cultures' model of the law–science relationship and to put in its place a more nuanced picture of the positive interplay between normative and cognitive authority. By exploring in detail how the law not only 'uses' science, but also questions it and often compensates for its deficiencies, the new scholarship on science and law invited reflection on the intricate balancing of truth and justice in technologically advanced societies.

See also: Biotechnology; Disciplines, History of, in the Social Sciences; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Intellectual Property, Concepts of; Intellectual Property: Legal Aspects; Law: History of its Relation to the Social Sciences; Legal Process and Social Science: United States; Norms in Science; Power in Society; Professions, Sociology of; Science and Technology Studies: Experts and Expertise; Science and the State; Science, Economics of; Truth and Credibility: Science and the Social Study of Science
Bibliography
Angell M 1996 Science on Trial: The Clash of Medical Evidence and the Law in the Breast Implant Case. Norton, New York
Beck U 1992 Risk Society: Towards a New Modernity. Sage, London
Bullard R D 1990 Dumping in Dixie: Race, Class, and Environmental Quality. Westview, Boulder, CO
Cole S 2001 Manufacturing Identity: A History of Criminal Identification Techniques from Photography Through Fingerprinting. Harvard University Press, Cambridge, MA
Ezrahi Y 1990 The Descent of Icarus: Science and the Transformation of Contemporary Democracy. Harvard University Press, Cambridge, MA
Foster K R, Huber P W 1997 Judging Science: Scientific Knowledge and the Federal Courts. MIT Press, Cambridge, MA
Foucault M 1990 The History of Sexuality. Vintage, New York
Fuller L 1964 The Morality of Law. Yale University Press, New Haven, CT
Goldberg S 1994 Culture Clash. New York University Press, New York
Gutheil T G, Appelbaum P S 2000 Clinical Handbook of Psychiatry and the Law, 3rd edn. Lippincott Williams & Wilkins, Philadelphia, PA
Hacking I 1995 Rewriting the Soul. Princeton University Press, Princeton, NJ
Hart H L A 1961 The Concept of Law. Oxford University Press, Oxford, UK
Huber P 1991 Galileo's Revenge: Junk Science in the Courtroom. Basic Books, New York
Irwin A, Wynne B (eds.) 1994 Misunderstanding Science. Cambridge University Press, Cambridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Jasanoff S 1995 Science at the Bar: Law, Science and Technology in America. Harvard University Press, Cambridge, MA
Loftus E F, Ketcham K 1994 The Myth of Repressed Memory: False Memories and Allegations of Sexual Abuse. St. Martin's Press, New York
Lynch M, Jasanoff S (eds.) 1998 Contested identities: Science, law and forensic practice. Social Studies of Science 28: 5–6
Minow M 1990 Making all the Difference: Inclusion, Exclusion, and American Law. Cornell University Press, Ithaca, NY
National Research Council (NRC) 1994 Science and Judgment in Risk Assessment. National Academy Press, Washington, DC
Polanyi M 1962 The republic of science. Minerva 1: 54–73
Price D K 1985 America's Unwritten Constitution: Science, Religion, and Political Responsibility. Harvard University Press, Cambridge, MA
Sanders J 1998 Bendectin on Trial: A Study of Mass Tort Litigation. University of Michigan Press, Ann Arbor, MI
Schuck P 1993 Multi-culturalism redux: Science, law, and politics. Yale Law and Policy Review 11: 1–46
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Slovic P 2000 The Perception of Risk. Earthscan, London
Smith R, Wynne B (eds.) 1989 Expert Evidence: Interpreting Science in the Law. Routledge, London
Van Kampen P T C 1998 Expert Evidence Compared: Rules and Practices in the Dutch and American Criminal Justice System. Intersentia, Antwerp
Wagner P, Wittrock B, Whitley R (eds.) 1991 Discourses on Society: The Shaping of the Social Science Disciplines. Kluwer, Dordrecht, The Netherlands
Wells G L 1988 Eyewitness Identification: A System Handbook. Carswell, Toronto
Wynne B 1982 Rationality and Ritual: The Windscale Inquiry and Nuclear Decisions in Britain. British Society for the History of Science, Chalfont St. Giles, UK
S. Jasanoff
Science and Religion

Understanding the present relationship between science and religion requires a recognition of broad historical trends. The subject has been most commonly treated in depth by historians of science, though scholars in both pure science and theology have also been active. Several posts in the UK, including a professorship at Oxford, and at least three major journals are devoted to the subject of science and religion (Zygon, Perspectives on Science and Christian Faith, and Science and Christian Belief), and numerous organizations have been formed to promote its study both in the US and UK. Scholarship has become increasingly sophisticated, and frequent attacks on earlier 'simplistic' views have had the effect of excluding from the debate a great many ordinary people. This is quite unnecessary, though it is true that both 'science' and 'religion' are not eternally unchanging entities and do vary over time and space. In what follows, science is simply the study and systematic observation of the natural world, while religion may be seen generally as humanity's quest after God, a quest taking a variety of different
forms. The subject is, of course, an emotive issue for some, from vociferous antagonists of religion (like the geneticist Richard Dawkins) to equally uncompromising supporters (such as the American ‘creationists’). In the following account the main emphasis will be on Christianity, not for partisan reasons but because, historically, that religion more than any other has related closely to the emerging sciences.
1. The Social Relations of Science and Christianity

This topic has become the subject of a number of well-known statements or theses, five of which will now be considered. Some are about specific periods in history (such as the seventeenth century) while others relate to allegedly more timeless issues. Some bear the well-known names of their chief proponents.
1.1 The Merton Thesis

In 1938, the American sociologist Robert Merton argued that the emergence of science in the seventeenth century was promoted by the ascendancy of Protestant religion. Since then this contentious position has been revisited again and again, and it is hard to resist the conclusion that, in modified form, the thesis has more than an element of truth. Certainly, there is a correlation between visibility in science and religious allegiance. In the French Académie des Sciences from 1666 to 1885, the ratio of Protestants to Catholics was 80:18, despite the preponderance of Catholics in the country. Others have shown a similarly high proportion of Puritans (not merely Protestants) in the membership of the early Royal Society in England, and Merton argued that Puritan attitudes did much to encourage its growth. There have been objections to such 'counting of heads,' not least because of the difficulty of defining a Puritan. If such a person is considered theologically rather than politically some of the difficulties disappear. On this basis a Puritan is one who holds strongly to the teaching of the Bible as opposed to the church or tradition, not necessarily a supporter of the Parliamentary cause in revolutionary England. Yet it may still be argued that such correlations do not prove that Puritan theology encouraged science, for may not both have emerged from a common cause? Could not both have been an expression of new movements of social and economic change and of a libertarian philosophy? This may well be true, but a general correlation in the 1600s between promotion of science and a generally Protestant loyalty to the Bible seems inescapable.
1.2 The Hooykaas Thesis

This goes further than the Merton thesis might suggest, and argues that the origins of modern science do not merely correlate neatly with Protestant beliefs but are directly derived from them. It has been implied by many authors since the 1970s but is particularly associated with the Dutch historian of science Reijer Hooykaas (1906–94) in his epochal Religion and the Rise of Modern Science. To make such an assertion is to stand traditional views on their head. It was once customary to see science as a truly Greek inheritance, at last freed from religious shackles at the Renaissance. Hooykaas invites us to view it rather as a product of Biblical theology, freshly released at the Reformation, and subverting much of Greek philosophy which had for 1500 years inhibited any real rise of experimental science.

Serious evidence exists in support of this thesis. There are explicit declarations by many well-known scientific figures from Francis Bacon to Isaac Newton and beyond that their science was inspired theologically. Then there is a remarkable congruity between Biblical and scientific credos. At least five points of intersection can be identified.

(a) Most profound of all, perhaps, was what Hooykaas called the 'demythologization of nature,' so that nature was no longer to be seen as divine or even animate, but more like a machine than anything else. It is not God but a creation by him, and therefore a proper object of study and experiment. Such was the theology of the Old and New Testaments, and it was eloquently proclaimed by statements from 'the father of chemistry' Robert Boyle and many others.

(b) Moreover, the idea that nature worked by laws is at the very foundation of science, and is also Biblical. Amongst those who wrote of laws impressed by God on nature were Descartes, Boyle, and Newton. Men looked for laws when they recognized the law-giver.

(c) But science can only discover these laws by experimentation. Manipulation of nature had been regarded by many Greeks as socially unacceptable (except for slaves) or even impious. Among the most urgent advocates of an experimental method, based on widespread Biblical approval of manual techniques for testing, was Francis Bacon, who urged 'men to sell their books, and to build furnaces.'

(d) Then again a further religious impetus to science came from the Biblical exhortations to see the heavens and earth as manifesting the glory of their Creator. The astronomer Kepler, it is said, asserted that in his researches he 'was thinking God's thoughts after him.' This religious motivation became a strong emphasis in Calvinistic theology.

(e) Finally, the Biblical mandate for humanity to exert 'dominion' over nature opened up possibilities of scientific work 'for the glory of God and the relief of man's estate' (Bacon). As Hooykaas wrote:

The Biblical conception of nature liberated man from the naturalistic bonds of Greek religiosity and philosophy and gave a religious sanction to the development of technology [Hooykaas, Religion and the Rise of Modern Science, 1973, p. 67].
One cannot dismiss the burgeoning literature linking science to Protestantism in the seventeenth century as mere rhetoric. But it can be argued that Hooykaas underestimates the numbers of Roman Catholics who were eminent in science, though even here caution is needed. The oft-quoted counter-example of the Catholic Copernicus must not obscure the astronomer's indebtedness both to his Lutheran assistant Rheticus and to the liberal legacies of Erasmus within his own church. Attitudes towards nature varied widely within Catholicism, and there is always the danger of generalising too widely. But in its essence the Hooykaas thesis appears to be substantially correct.
1.3 The Lynn White Thesis

If Christian theology has been one of the formative influences on modern science it does not follow that this has always been in the best interests of the world and its inhabitants. That opinion has been forcibly expressed in a further thesis, first proposed by the American historian Lynn White in 1966/7. He argued that much of the damage to our environment springs from a misuse of science and technology for which 'Christianity bears a huge burden of guilt.' More specifically he locates the problem in the 'realisation of the Christian dogma of man's transcendence of, and rightful mastery over, nature' (White 1967). He urges a return not to the primitive Christianity of the New Testament but to the animistic world of St Francis of Assisi which saw the earth and its inhabitants as brothers rather than instruments. In White's view the Biblical call to 'dominion' must have been understood in terms of exploitation. However, careful historical examination of the evidence suggests that few, if any, pioneers in science or theology held this particular view. John Calvin explicitly repudiated it, as did the noted eighteenth-century writer William Derham. They, and many others, urged an interpretation of dominion as 'responsible stewardship.' In more recent times ideologically driven human conquest of nature has been on the Marxist rather than the Christian agendas. It is now widely acknowledged that, as a historical generalization, the Lynn White thesis stands largely discredited.
1.4 The Conflict Thesis

This thesis is much older than the others we have examined, and much better known. The thesis states that science and religion have been in perpetual conflict and that, eventually, science will vanquish religion. It was most notoriously promulgated by the Victorian naturalist T. H. Huxley, but has been advocated by many others in the late nineteenth century and is probably best not attributed to any one individual. Two books arguing the point are History of the Conflict between Religion and Science by J. W. Draper (first published in 1875), and A History of the Warfare of Science with Theology in Christendom by A. D. White (1895). They achieved enormous circulation and are still in print. The essence of the argument is that where scientific conclusions have been challenged by the church there has usually been an eventual retreat by the ecclesiastical authorities. Two classic cases were those of Galileo in the seventeenth century and of Darwin nearly 250 years later. It was in the aftermath of the latter imbroglio that the conflict thesis was formulated. Many more examples were unearthed by Draper and White and an impressive case assembled (especially by the latter).

Manifestly, it is hard to reconcile this thesis with the three previously considered. If science owed so much to religion how could they possibly be in conflict? The thesis flies in the face of massive evidence that points to a powerful alliance between science and Christianity since at least the seventeenth century. A noteworthy feature of the conflict thesis is that it was first proposed at a time when history of science hardly existed and historical scholarship had certainly not explored the nuances perceived by Merton, Hooykaas, and their colleagues. Detailed examination of the relatively minor cases urged by Draper and White reveals that many were badly documented, some plainly apocryphal and others greatly exaggerated. It seems that the authors sometimes saw what they were looking for and built their generalizations on a slender foundation. To be sure, Galileo and Darwin were assailed by organized religion but they seem to have been relatively isolated exceptions to the general rule that Christianity and science coexisted in harmony. Partial explanations for the persecution of Galileo lay in power struggles within the church, while Darwin's problems arose in part from a deeply divided society in industrial Britain.

Today the books of Draper and White are rated not as serious works of historical scholarship but as highly polemical tracts reflecting the tensions existing in the social and cultural environment in which the authors lived. Draper wrote as a man deeply disenchanted with the Roman Catholic church, not least for its recent declaration of papal infallibility. White, on the other hand, was President of one of the first explicitly nonsectarian colleges in the US (Cornell) and had suffered badly at the hands of the religious establishment. Each had his own axe to grind in taunting the organized church. The result was not history but myth.

Yet a conflict thesis is commonly held in the world today, so one needs to ask how such a tendentious myth could have become so entrenched in Western culture. To understand the reason it is necessary to recall the plight of English science in the last 40 years of Victoria's reign. Underfunded by government, inadequately supported by secondary education, unpopular with the general public for a variety of reasons, and a poor relation in those citadels of the establishment, the universities of Oxford and Cambridge, English science was crying out for recognition and support. It was falling seriously behind in the race with Continental science (especially German) and had no chance of occupying a leading place in English culture. In search of a weapon with which to acquire some kind of hegemony Thomas Henry Huxley and some close allies determined on a course of action. They would seek by all means within their power to undermine the privileged position of the Anglican church and put science in its place. A detailed strategy was worked out, an important element being the propagation of a conflict thesis in which the church was always portrayed as the vanquished party in its perennial fights with science. The books of Draper and White were at their service, and so was evolution which, as Huxley said, was a good stick with which to beat the church. With all the discipline of a military campaign the attack was launched and the conflict thesis enshrined as a self-evident truth in English culture. It was demonstrably wrong but it carried the day, at least with the general public (see Russell 1989).
1.5 The Complementarity Thesis

In the 1960s and 1970s the scientist/philosopher Donald MacKay made much of the paradigm of complementarity. Derived from modern physics, where, for instance, wave and particle descriptions of light are complementary, the notion was applied to explanations from science and theology. It was the antithesis of a reductionist view where objects or events were nothing but phenomena of one kind (usually materialist). This view did not preclude traffic from one area to another, and did not deny their mutual influence. Indeed MacKay was committed strongly to the Hooykaas thesis. But it did mean that science and theology could co-exist as complementary enterprises. We may call this the complementarity thesis. Its merits include the ability to account for several phenomena of modern and recent science. One of these is the large number of scientists who hold to Christian beliefs. In Huxley's day they included many of the most eminent physical scientists, such as Faraday, Joule, Stokes, and Lord Kelvin. Statistical surveys undertaken since that day have confirmed the same general trend, which is quite the opposite of common expectation. In the early years of the twenty-first century the numbers appear to be increasing. The only reasonable explanation is that such people regard their scientific and religious activities as complementary and nonthreatening. Membership data and corporate activities of organisations like the American Scientific Affiliation and Christians in Science appear to bear this out.

In modern times an interesting tendency strengthens still further the case for a complementary approach. That is the rebirth of natural theology, in which the natural universe is seen to testify to the power and goodness of God. Such arguments go back to Biblical times, but were advocated with fresh vigour in the late eighteenth and early nineteenth centuries, reaching their peak with Natural Theology by William Paley, first appearing in 1802. The argument for a Designer from the apparent design of the natural world received hard blows from the philosophical critique of Hume and the evolutionary theories of Darwin. The proponents of the conflict thesis added their opposition. Yet over 100 years later the disclosures of modern science have caused some cynics to have second thoughts, and have encouraged scientist/theologians like John Polkinghorne to revive a form of natural theology that does justice to a natural world infinitely more complex than that perceived by Paley or even Darwin.
2. Beyond Traditional Christianity

It remains to note that orthodox Christianity has no monopoly in engagement with science, though historically it happens to have been the most prominent faith to do so. Much literature on science and religion has focused, therefore, on the Christian faith. A modern variant of Christianity is sufficiently different to justify the term heterodox; that is, the so-called process theology that denies omniscience to God and regards natural processes as contributing to his self-fulfillment. It has been extensively used in discussions about indeterminacy in nature, about human free will, and about the phenomenon of evil. Though favoured by some philosophers and theologians it cuts little ice with the majority of scientists. The other great monotheistic religions of Judaism and, to a smaller extent, Islam have encouraged their followers in the study of nature though, operating in very different cultures and with different presuppositions, their effects have been rather unlike those of Christianity. The eastern mystical religions have arguably been less important for the growth of science, though some parallels between atomic physics and Buddhism have been emphasized by Fritjof Capra. The early promise of Chinese science was not fulfilled because the concept of scientific law was hardly present, reflecting, according to the historian Joseph Needham, the absence of any monotheistic law-giver in ancient Chinese culture.

Finally, the growth of environmental awareness has led some to espouse an almost pantheistic view of the universe as an alternative to the view of it as a machine which can be used or abused at whim. They have been encouraged by the Gaia hypothesis of James Lovelock, which emphasizes the complexity of the earth and its biosphere and its capacity for self-regulation. This has led several (including Lovelock) to posit the earth as in some sense 'alive.' From this it has been but a step to invoke the earth (Gaia) as a mother and even a goddess. This view may lead to a greater sensitivity to the need for sustainable development and other desirable attitudes to the earth. But taken to extremes it may also lead to a reversion to a prescientific animistic or even divine universe in which the benefits of a thriving science are rejected as minimal. If, however, the values of Christian monotheism are integrated with these views it may be possible for science to prosper, but always in a spirit of responsible stewardship. At present, we seem a long way from either of these positions.

See also: Creationism, Evolutionism, and Antievolutionism; Religion and Politics: United States; Religion and Science; Religion: Evolution and Development; Religion, History of; Religion, Psychology of; Religion, Sociology of

Bibliography
Barbour I 1990 Religion in an Age of Science. Harper, San Francisco
Brooke J H 1991 Science and Religion: Some Historical Perspectives. Cambridge University Press, Cambridge, UK
Finocchiaro M A 1989 The Galileo Affair: A Documentary History. University of California Press, Berkeley, CA
Harrison P 1998 The Bible, Protestantism and the Rise of Natural Science. Cambridge University Press, Cambridge, UK
Hooykaas R 1972 Religion and the Rise of Modern Science. Scottish Academic Press, Edinburgh, UK
Jeeves M A, Berry R J 1998 Science, Life and Christian Belief. Apollos, Leicester, UK
Kaiser C 1991 Creation and the History of Science. Marshall Pickering, London
Lindberg D C, Numbers R (eds.) 1998 God and Nature: Historical Essays on the Encounter Between Christianity and Science. University of California Press, Berkeley, CA
Livingstone D N, Hart D G, Noll M A (eds.) Evangelicals and Science in Historical Perspective. Oxford University Press, New York
Moore J M 1979 The Post-Darwinian Controversies: A Study of the Protestant Struggle to Come to Terms with Darwin in Great Britain and America 1870–1900. Cambridge University Press, Cambridge, UK
Nebelsick H P 1992 The Renaissance, the Reformation and the Rise of Science. T. & T. Clark, Edinburgh, UK
Polkinghorne J 1990 One World: The Interaction of Science and Theology. SPCK, London
Polkinghorne J 1991 Reason and Reality: The Relationship Between Science and Theology. SPCK, London
Russell C A 1985 Cross-currents: Interactions Between Science and Faith. Inter-Varsity Press, Leicester, UK (corrected and reprinted, Christian Impact, London, 1996)
Russell C A 1989 The conflict metaphor and its social origins. Science and Christian Belief 1: 3–26
Russell C A 1994 The Earth, Humanity and God. University College Press, London
White L 1967 The historical roots of our ecologic crisis. Science 155: 1203–7
C. A. Russell
Science and Social Movements

The social movements that are discussed in this article refer to major political processes of popular mobilization that have broad societal significance, and include organized forms of protest as well as broader and more diffuse manifestations of public opinion and debate. In relation to science, social movements conceived in this way can be seen to have served as seedbeds for new modes of practicing science and organizing knowledge more generally, and also as sites for critically challenging and reconstituting the established forms of scientific activity.

1. Introduction

Since they fall between different academic specializations and disciplinary concerns, the relations between science and social movements have tended to be a neglected subject in the social sciences. For students of social movements in sociology and political science, the relations to science have been generally of marginal interest and received little systematic attention (della Porta and Diani 1999), while in science and technology studies, social movements usually have been regarded more as contextual background than as topics of investigation in their own right. Considering their potential significance as 'missing links,' both in the social production of knowledge and in broader processes of political and social change, however, the relations between science and social movements deserve closer scrutiny. For one thing, new ideas about nature and society have often first emerged outside the world of formal scientific activity, within broader social and political movements. Social movements have also often provided audiences, or new publics, for the spreading and popularization of scientific findings and results. Movements of religious dissent were among the most significant disseminators of the new experimental philosophy in the mid-seventeenth century, for example, while, in the nineteenth and twentieth centuries, new approaches to medicine, social relations, gender roles, and the environment found receptive audiences within social and political movements. Most visibly, social movements and their spokespersons have offered critical perspectives on both the form and content of science, as well as on the uses to which science has been put. In recent decades, a range of so-called new social movements have criticized the dominant biases and perspectives of many scientific and technological fields. This article first presents these different forms of interaction between science and social movements in general, historical terms. It then discusses contemporary relations between science and social movements, and, in particular, the environmental movement.
2. Historical Perspectives

A fundamental insight of the sociology of knowledge, as it developed in the 1920s and 1930s, was to suggest that modern science emerged in the seventeenth century out of a more all-encompassing struggle for political freedom and religious reform (Scheler 1980/1924; Merton 1970/1938). What eventually came to be characterized as modern science represented an institutionalized form of knowledge production that, at the outset, had been inspired by more sweeping social and political transformations. The historical project of modernity did not begin as a new scientific method, or a new mechanical worldview, or, for that matter, as a new kind of state support for experimental philosophy in the form of scientific academies. As for the Reformation, it arose as a more deep-seated challenge to traditional ways of thought in social and religious life. It was a protest against the Church (which is why the members of the movement were called protestants), and it was an encompassing social and cultural movement that articulated and practiced alternative, or oppositional, forms of religion, politics, and learning as part of its political struggle (Mandrou 1978).

Table 1 A schematic representation of the scientific revolution from movement to institution

From movement …                                      To institution
Social reform                                        Reform of philosophy
Millenarian vision                                   Experimental program
Connection to radical politics                       Rejection of politics
Decentralized structure                              Central academy
Democratic/open to all                               Elitist/professional
Spiritual (absolute) knowledge                       Instrumental (probabilistic) knowledge
Technical-economic improvement                       Scientific development
Informal communication (pamphlets, correspondence)   Formal communication (journals, papers)

The teachings of Paracelsus, Giordano Bruno, and Tommaso Campanella, to name some of the better known examples, combined a questioning of established religious and political authority with an interest in scientific observation, mathematics, mechanics, and technical improvements (Merchant 1980). But as the broader movements were replaced by more formalized institutions in the course of the seventeenth century, the political and social experiments came to be transformed into scientific experiments; the political and religious reformation, tinged with mysticism and filled with mistrust of authority, was redefined and reconstituted, at least in part, as a scientific revolution (see Table 1).

In the later seventeenth and early eighteenth centuries, the scientific 'aristocracy' that had emerged in London and Paris at the Royal Society and the Académie des Sciences was challenged by dissenting groups and representatives of the emerging middle classes, some of whom fled from Europe to the colonies in North America, and some of whom established scientific societies, often in provincial areas in opposition to the established science of the capital cities. Many of them shared with the academicians and their royal patrons a belief in what Max Weber termed the protestant ethic, that is, an interest in the value of hard work and the virtue of making money, and most had an interest in what Francis Bacon had termed 'useful knowledge.' The social movements of the Enlightenment objected, however, to the limited ways in which the Royal Society and the Parisian Academy had organized the scientific spirit and institutionalized the new methods and theories of the experimental philosophy. The various attempts to democratize scientific education in the wake of the French Revolution and to apply the mechanical philosophy to social processes—that is, to view society itself as a topic for scientific research and analysis—indicate how critique and opposition helped bring about new forms of scientific practice. Adam Smith's new science of political economy developed in the Scottish hinterland, and many of the first industrial applications of experimentation and mechanical philosophy took place in the provinces, rather than in the capital cities, where the scientific academies were located (Russell 1983).

Many of the political and cultural trends of the nineteenth century—from romanticism and utopianism to socialism and populism—also began as critical movements that at a later stage, and in different empirical manifestations, came to participate in reconstituting the scientific enterprise. Romanticism, for example, first emerged as a challenge to the promethean ambitions of science, in the guise, for example, of Mary Shelley's mad Doctor Frankenstein, while socialism, in the utopian form promulgated by Robert Owen, was a reaction, among other things, to the problematic ways in which the technological applications of scientific activity were being spread into
modern societies. Romantic writers and artists, like William Blake, Johann Goethe, and later Henry David Thoreau, were critical of the reductionism and cultural blindness of science and of the 'dark satanic mills' in which the mechanical worldview was being applied. They sought to mobilize the senses and the resources of myth and history in order to envision and create other ways of knowing (Roszak 1973). Some of their protest was transformed into constructive artistic creation, and later in the century, the romantic revolt of the senses inspired both the first waves of environmentalism in the form of conservation societies and the emergence of a new kind of holistic scientific discipline, to which was given the name ecology: household knowledge.

In the late nineteenth century, the labor movement also sought a deep-going, fundamental political and cultural transformation of society, but it would see its critical message translated, in the twentieth century, into packages of reforms and a more welfare-oriented capitalism, on the one hand, and into the 'scientific socialism' of Lenin and Stalin on the other (Gouldner 1980). Once again, however, science benefited from this institutionalization of the challenge of social movements; the knowledge interests of the labor movement, for example, entered into the new social science disciplines of economics and sociology. A political challenge was once again transformed into programs of scientific research and state policy; but while new forms of scientific expertise were developed, little remained of the broader democratization of knowledge production that the labor movement, in its more radical days, had articulated.

In the early twentieth century, the challenge to established science was mobilized most actively in the colonies of European imperialism, as well as in the defeated imperialist powers of Germany and Italy; the critique was primarily of scientific civilization writ large, and of what Mahatma Gandhi in India called its 'propagation of immorality.' In the name of modernism, science stood for the future and legitimated, in the colonies as well as in Europe, a wholesale destruction of the past and of the 'traditional' knowledges—the other ways of knowing—that had been developed in other civilizations (Tambiah 1990). These social movements had an impact on the development of political ideology on both the right and left, but also inspired new sciences of ethnography and anthropology, as well as the sociology of knowledge. Even more important perhaps were the various attempts to combine the artisanal knowledges of the past and of other peoples with the modern science and technology of the present in new forms of architecture, design, and industrial production. Many of the regional development programs of the 1930s and 1940s, in Europe and North America, can trace their roots back to the cultural critique of modernism that was inspired by the Indian independence movement and by such 'movement intellectuals' as William Morris and
Patrick Geddes in Britain. Both Roosevelt's New Deal and the Swedish model welfare state can be said to have mobilized civilizationally critical perspectives in their projects of social reconstruction.
3. Science and Contemporary Social Movements

In recent decades, social movements have also served to challenge and reorient the scientific enterprise. Out of the anti-imperialist and student movements of the 1960s and the environmentalist, feminist, and identity movements of the 1970s and 1980s have emerged a range of alternative ideas about science, in form, content, and meaning, that have given rise to new scientific theories, academic fields, and technological programs (Schiebinger 1993, Harding 1998). Out of critique have grown the seeds of new, and often more participatory, ways of sciencing—from technology and environmental impact assessment to women's studies, queer theory, and postcolonial discourses. What were in the 1960s and 1970s protest movements of radical opposition have largely been emptied of their political content, but they have given rise to new branches of, and approaches to, science and technology. While the more radical, or oppositional, voices have lost much of their influence, the more pragmatic and scientific voices have been given a range of new opportunities. This is not to say that there is no longer a radical environmental opposition or a radical women's liberation movement, but radicals and reformists increasingly have drifted apart from one another, and in most countries now work in different organizations, with little sense of a common oppositional movement identity. There has been a fragmentation of what was a coherent movement into a number of disparate bits and pieces (Melucci 1996).

The new social movements rose to prominence in the downturn of a period of institutional expansion and economic growth. They emerged in opposition to the dominant social order and to its hegemonic scientific-technological regime, which had been largely established during and immediately after World War II (Touraine 1988). The war led to a fundamental transformation in the world of science and technology, and to the emergence of a new relation, or contract, between science and politics. Unlike previous phases of industrialization, in which science and engineering had maintained parallel but separate identities, World War II ushered in the era of technoscience. The war effort had been based on an unprecedented mobilization of scientists to create new weapons, from radar to the atomic bomb, and to gather and conduct intelligence operations. In the process, 'little' science was transformed into big science, or industrialized science. Especially important for the social movements that were to develop in the 1960s and beyond was the fact that scientific research was placed at the center of postwar economic development. Many of the economically significant new products—nylon and other synthetic textiles, plastics, home chemicals and appliances, television—were based directly on scientific research, and the new techniques of production were also of a different type: it was the era of chemical fertilizers and insecticides, of artificial petrochemical-based process industries, and food additives. The forms of big science also differed from the ways in which science had been organized in the past. The big science laboratories—both in the public and private sectors—were like industrial factories, and scientific-technical innovation came to be seen as an important concern for business managers and industrial organizers. The use of science in society had become systematized, and, as the consequences of the new order became more visible, new forms of mistrust and criticism developed (Jamison and Eyerman 1994).

One wing of the public reacted to the destruction of the natural environment, what an early postwar writer, Fairfield Osborn, termed 'man's war with nature.' The exploitation of natural resources was increasing and, in the 1940s and 1950s, it began to be recognized that the new science-based products were more dangerous for the natural environment than those that had come before. But it would only be with Rachel Carson, and her book Silent Spring in 1962, that an environmental movement began to find its voice and its characteristic style of expression. It was by critically evaluating specific instances of scientific technology, particular cases of what Osborn had called the 'flattery of science,' that the environmentalist critique would reach a broader public. Carson singled out the chemical insecticides for detailed scrutiny and assessment, but her point was more general. Carson's achievement was to direct the methods of science against science itself, but also to point to another way of doing things, the biological or ecological way—what she called in her book the road not taken.

Another source of inspiration for the new social movements came from philosophers and social historians who questioned the more general impact of technoscience on the human spirit. It was one-dimensional thinking that critical theorists like Herbert Marcuse reacted against, the dominance of an instrumental rationality over all other forms of knowing. For Lewis Mumford, another major source of inspiration, it was the homogenization of the landscape that was most serious, the destruction of the organic rhythms and flows of life that had followed in the wake of postwar economic growth, as well as the dominance of what he termed the 'megamachine,' the use of technology for authoritarian purposes.

In the 1970s, a range of new social movements, building on these and other sources of inspiration, came to articulate an oppositional, or alternative, approach to science and technology (Dickson 1974). The so-called new social movements represented an integrated set of knowledge interests, which combined the critique of Rachel Carson with the liberation
dialectics of Marcuse and the direct democracy of the student movement. The new movements involved a fundamental critique of modern science's exploitative attitude to nature, as well as an alternative organizational ideal—a democratic, or participatory ideal—for the development of knowledge. There were also distinct forms of collective learning in the new social movements of environmentalism and feminism, as well as grass-roots engineering activities that went under the name of appropriate or alternative technology.
4. Social Movements as Cognitive Praxis

On the basis of these historical and contemporary relations between science and social movements, many social movements can be characterized as producers of science (Eyerman and Jamison 1991). The critical ideas and new public arenas that are mobilized by social movements often provide the setting for innovative forms of cognitive praxis, combining alternative worldviews, or cosmological assumptions, with alternative organizational and practical–technical criteria. In the case of environmentalism, the cosmology was, to a large extent, the translation of a scientific paradigm into a socioeconomic paradigm; in the 1970s, the holistic concepts of systems ecology were transformed into political programs of social ecology, and an ecological worldview was to govern social and political interactions. Technology was to be developed under the general perspective that 'small is beautiful' (in the influential phrase of E. F. Schumacher), and according to the assumption that large-scale, environmentally destructive projects should be opposed and stopped. At the same time, new contexts for education and experimentation and the diffusion of research were created in the form of movement workshops and, in the Netherlands, for example, in the form of science shops, which allowed activist groups to gain access to the scientific expertise at the universities (Irwin 1995).

In the 1980s, this cognitive praxis largely decomposed into a disparate cluster of organizations and individuals, through processes of professionalization and fragmentation. The knowledge interests of the environmental movement were transformed into various kinds of professional expertise, which made it possible to incorporate parts of the movement into the established political culture, and to shift at least some of the members of the movement from outsider to insider status. Some of the alternative technical projects proved commercially viable—biological agriculture, wind energy plants, waste recycling—and gave rise to a more institutionalized form of environmental politics, science, and technology (Hajer 1995). Similar processes can be identified in relation to other social movements of recent decades (Rose 1994). The political struggles for civil rights, women's and sexual liberation, and ethnic and national identity
have inspired new approaches to knowledge that have since been institutionalized and transformed into established scientific fields, such as women's studies, gay and lesbian studies, and African-American studies, as well as new areas of medicine and technology. At the same time, science itself has been reconstituted, partly as a result of the critical perspectives and cognitive challenges posed by social movements. Many of the postmodern theories of the cultural and human sciences have been inspired by the experiences of social and political movements. Out of the alternative public spaces that have been created by social and political movements has emerged a new kind of scientific pluralism, in terms of organization, worldview assumptions, and technical application. In the transformations of movements into institutions, a significant channel of cognitive and cultural change can thus be identified. It may be hoped, in conclusion, that these interactions between science and social movements will receive more substantial academic attention in the future.

See also: History of Science; Innovation, Theory of; Kuhn, Thomas S (1922–96); Marx, Karl (1818–89); Social Movements, History of: General; Social Movements: Psychological Perspectives; Social Movements, Sociology of
Bibliography
della Porta D, Diani M 1999 Social Movements. An Introduction. Routledge, London/Blackwell, Oxford, UK
Dickson D 1974 Alternative Technology and the Politics of Technical Change. Fontana, London
Eyerman R, Jamison A 1991 Social Movements. A Cognitive Approach. Penn State University Press, University Park, PA
Gouldner A 1980 The Two Marxisms. Contradictions and Anomalies in the Development of Theory. Seabury University Press, New York
Hajer M 1995 The Politics of Environmental Discourse. Ecological Modernization and the Policy Process. Oxford University Press, Oxford, UK
Harding S 1998 Is Science Multicultural? Postcolonialisms, Feminisms, and Epistemologies. Indiana University Press, Bloomington, IN
Irwin A 1995 Citizen Science. A Study of People, Expertise and Sustainable Development. Routledge, London
Jamison A, Eyerman R 1994 Seeds of the Sixties. University of California Press, Berkeley, CA
Mandrou R 1978 From Humanism to Science 1480–1700. Penguin Books, Harmondsworth, UK
Melucci A 1996 Challenging Codes: Collective Action in the Information Age. Cambridge University Press, Cambridge, UK
Merchant C 1980 The Death of Nature. Women, Ecology, and the Scientific Revolution. Harper and Row, San Francisco
Merton R 1970 (1938) Science, Technology, and Society in Seventeenth-century England. Harper and Row, New York
Rose H 1994 Love, Power and Knowledge. Toward a Feminist Transformation of the Sciences. Polity Press, Cambridge, UK
Roszak T 1973 Where the Wasteland Ends: Politics and Transcendence in Post-industrial Society. Anchor, New York
Russell C 1983 Science and Social Change 1700–1900. Macmillan, London
Scheler M 1980 (1924) Problems of a Sociology of Knowledge. Routledge, London
Schiebinger L 1993 Nature's Body: Gender in the Making of Modern Science. Beacon, Boston
Tambiah S 1990 Magic, Science, Religion and the Scope of Rationality. Cambridge University Press, Cambridge, UK
Touraine A 1988 The Return of the Actor. Social Theory in Postindustrial Society. University of Minnesota Press, Minneapolis, MN
A. Jamison
Science and Technology, Anthropology of

The anthropology of science and technology is an expanding arena of inquiry and intervention that critically examines the cultural boundary that sets science and technology off from the lives and experiences of people. Most research in this arena draws on ethnographic fieldwork to make visible the significance and force of this boundary, as well as meanings and experiences that cut across it or otherwise get hidden. Key categories of projects include juxtaposing shared systems of cultural meaning, following flows of metaphors back and forth across the boundary, and retheorizing or relocating the boundary itself in ways that reconnect science and technology to people. While advancing understanding of how people position science and technology in both specialized and everyday worlds, the anthropology of science and technology also calls attention to cultural possibilities whose realization might help expand participation in decision-making about science and technology.
1. Helping STS Fulfill its Dual Objectives

The unique contribution of this field to science and technology studies (STS) is that it helps revive and refocus questions regarding relations among science, technology, and people. STS research and researchers are held together by their diverse, yet collective, efforts to trouble and transform the dominant, but simplistic, model or image of science and technology in society. According to the dominant model, researchers live in specialized technical communities whose deliberations are essentially opaque and presumably free of cultural content. Knowledge, in the singular, is created by these bright, well-trained people located inside the academy and then diffuses outside into the public arena through mechanisms of education, popularization, policy, and the benefits of new technologies. Its social significance is evaluated exclusively in the public arena, where knowledge is used, abused, or ignored. The outward travel of knowledge preserves the autonomy of creation and establishes a sharp
boundary between science and technology on the one side and people on the other, including the actual lives of scientists and technologists. STS has sought to engage this model in two ways. First, it brings together researchers who analyze the conceptual and the social dimensions of science and technology simultaneously and in historical perspective. Second, by offering new ways of thinking, STS promises to afford society new pathways for confronting and resolving problems that involve science and technology. STS thus offers the dual trajectories of theory and intervention, of proposing new frameworks of interpretation, and participating critically in societal problem solving. These activities complement one another. Good theory defines pathways that make a difference, and successful acts of critical participation depend upon novel theoretical insights.

During the 1980s and early 1990s, a major focus within STS was on a philosophical debate between 'objectivism' and '(social) constructivism.' Constructivism provided a major theoretical advance over 1970s research on the 'public understanding of science' and the 'impacts' of science and technology, which tended to take for granted the internal contents of science and technology (see Technology Assessment). One important body of constructivist work was often labeled 'anthropology of science' because it relied on the direct observation of scientists in laboratories to link scientific practices to knowledge development. For more information on this formative strand in the anthropology of science (see Laboratory Studies: Historical Perspectives; Actor Network Theory). By questioning how science and technology gain internal contents, the philosophical debate between objectivism and constructivism concentrated attention on the science/technology side of the cultural boundary with people. Also, a focus on the emergence and stabilization of either new scientific knowledge or new technological artifacts tended to maintain the more general cultural separation between science and technology (see Technology, Social Construction of). The newer strands in the anthropology of science and technology explicitly foreground the interventionist potential of STS by moving back and forth across the boundary between science/technology and people, investigating its force and making visible meanings and experiences that get hidden when it is taken for granted. In this work, the commitment to ethnographic practices forces attention to alternative pathways for critical participation as an integral part of theoretical innovation (see also Ethnography).
2. Helping Anthropology Rethink Culture
As recently as 1988, the American Anthropological Association (AAA) rejected proposed panels on the anthropology of science and technology on the grounds that such work did not fit under the AAA umbrella. Things had changed by 1992, when a series of
panels on ‘cyborg anthropology’ and on the work of Donna Haraway attracted standing-room-only audiences in a ballroom setting. Key to this shift was a growing recognition that anthropological debates over the status of cultural analysis and cultural critique were similar to developments and deliberations in STS. Anthropologists grew hopeful that analyzing and critiquing the dominant model of science and technology in cultural terms might further people’s understanding of how sciences and technologies, including anthropology, live in society. Emerging questions included: what are the implications of accounting for science, technology, and people in cultural terms? Might the finding of wider cultural meanings in science and technology improve searches for alternative configurations? To what extent has anthropology itself depended on the dominant model of science and technology in society? How can one participate critically in science and technology through ethnographic work? As it emerged from symbolic anthropology during the 1970s, the concept of culture drew from the predominant model of language as an underlying grammar, structure, or system of symbols and meanings (see Culture: Contemporary Views). Beneath surface differences in speech and action among competent participants in a given community lies a more fundamental sharedness, a bounded ‘culture.’ This concept of cultures as sets of shared assumptions or presuppositions depends on a contrast with the concept of ‘nature.’ Where nature provides people with base needs and desires, culture provides content and meaning. Until the 1980s, it appeared difficult to apply the concept of culture to science and technology. While theorizing culture as a bounded system helped in accounting for differences across cultures, this approach offered no means of accounting for differences within cultures. Yet research questions in STS typically focused on the latter. Also, much research in anthropology itself, for example, kinship theory, depended on the nature/culture distinction for its legitimacy. Treating anthropological work itself as a cultural enterprise could have threatened to undermine the discipline. As Franklin (1995) put it, by the 1990s ‘several trajectories coalesce[d] to produce … momentum’ for an emerging anthropology of science and technology. Feminist anthropology had made visible the role of biological assumptions in gender and kinship studies (see Kinship in Anthropology). Poststructuralism made it possible to think about power–knowledge relationships and to make the conceptual limitations of the ‘human’ a focal point for social theory (see Knowledge, Sociology of; Postmodernism in Sociology). Postmodernist critiques of science and technology called attention to the processes of their production, making ‘progress’ a contingent effect (see Cultural Studies of Science). In revealing how anthropologists ‘write culture’ by turning people into ‘others,’
postmodernism in anthropology introduced ‘cultural critique’ as an anthropological practice (see Cultural Critique: Anthropological). Cross-cultural comparisons of knowledge systems began to reposition Western science as ethnoscience (see Knowledge, Anthropology of; Indigenous Knowledge: Science and Technology Studies; Postcoloniality). The rise of interdisciplinary cultural studies juxtaposed ‘popular’ with ‘high’ cultural forms, revealing power relations between the two and retheorizing culture as a site of active work (see Cultural Studies of Science). Feminist critiques of science demonstrated its saturation with gender metaphors, and feminist critiques of reproductive technologies provided compelling accounts of women’s experiences that could not be counted unproblematically as the benefits of innovation (see Feminist Epistemology; Gender and Technology). Cross-disciplinary interests in emerging transnational forms demanded simultaneous attention to technology, knowledge, and capital (see Globalization, Anthropology of; Postcoloniality; Capitalism: Global). Growing demands for accountability across the academy in general fueled interest in how ethnographic research can make a difference in the arena under study (see Ethnography; Advocacy in Anthropology). During the early 1990s, cultural anthropologists studying science and technology actively resisted the label ‘anthropology of science and technology’ on the grounds that it masked these trajectories, confining diverse work to a bounded subdiscipline. The label works only to mark a collection of intellectual activities within which competition exists not to achieve domination but to make a difference, and where border crossing is accepted to enhance the life chances of what the dominant model submerges. Ongoing projects fall into roughly three categories, with individual researchers and studies often contributing to more than one.
3. Juxtaposing Cultural Systems of Meaning
The publication of Sharon Traweek’s Beamtimes and Lifetimes in 1988 marked the emergence of projects that identify and juxtapose cultural systems of meaning in science and technology. Rather than focusing solely on theory change in science, Traweek provides ‘an account of how high energy physicists see their own world; how they have forged a research community for themselves, how they turn novices into physicists, and how their community works to produce knowledge’ (1988, p. 1). Studying the culture of high-energy physicists extends and transforms the anthropological project of cross-cultural comparison by studying people who live in more than one culture at the same time, in this case those of the international physics community, Japan, and the United States. Demonstrating sharedness among high-energy physicists and then setting physicists off as a community
achieves two key contributions. It repositions the dominant model of science and technology, which physicists both embrace and embody, into one among many possible cultural perspectives without suggesting that all perspectives on physical knowledge are equal. The ethnographic approach also locates the researcher within the power relations that constitute the field of study and makes visible ways in which scientists function as people who live social lives. The juxtaposition of shared systems of meaning has proven fruitful in analyzing public controversies over science and technology. Articulating subordinate perspectives and locating these alongside dominant ones demonstrates that factual claims gain meaning in terms of more general frameworks of interpretation. Such work also intervenes in power relations in ways that highlight mediation and collaboration as possible pathways for resolution (see also Scientific Controversies). Representative contributions in this category juxtapose Brazilian spiritists, parapsychologists, and proponents of bacterial theories of cancer with ‘orthodox’ perspectives (David Hess); antinuclear weapons groups with nuclear weapons scientists (Hugh Gusterson); creation science with evolutionary science (Christopher Toumey); antinuclear power with pronuclear power groups (Gary Downey); artificial intelligence researchers with expert system users (Diana Forsythe); advocates of midwifery with obstetrics (Robbie Davis-Floyd); Marfan scientists with technicians and activists (Deborah Heath); a unified Europe with separate nation-states seeking space travel (Stacia Zabusky); nuclear power plant operators with plant designers (Constance Perin); and competing workgroups in industry (Frank Dubinskas). Recent work questions the culture/people relationship by understanding cultures as, for example, shifting discourses (Gusterson) or recursive, cross-cutting perspectives in a social arena (Hess). Future work will likely review the notion of sharedness, no longer asserting it as a condition of anthropological analysis but exploring what gets accomplished when it can be demonstrated empirically.
4. Following Flows of Metaphors Across the Boundary
The publication of Emily Martin’s The Woman in the Body in 1987 marked the emergence of projects that follow flows of metaphors from general cultural life into science and technology and back again into people’s lives and experiences. Writing a cultural history of the present, Martin demonstrates how medical conceptions of menstruation and menopause draw upon metaphors of production in portraying these as breakdown, the failure to produce. In addition to contrasting descriptions from medical textbooks with an ethnographic account of the actual experiences
of women, Martin also experiments with re-imagining menstruation and menopause as positive processes of production. Following flows of metaphors brings a new approach to the study of public understanding and of what the dominant model of science and technology characterizes as impacts. It calls direct attention to the existence and life of the cultural boundary that separates science and technology from people. How do people import scientific facts and technological artifacts into their lives and worlds? How does the dominant model of science and technology live alongside and inform other cultural meanings? Like the strategy of juxtaposition, following metaphors focuses attention on how people involve themselves with science and technology, making visible those meanings and experiences that do not have a place within the dominant model alongside those that do. What sorts of effects do scientific facts have in people’s bodies and lives? How do experiences with technologies go beyond the simply positive, negative, or neutral? How do emerging sciences and technologies contribute to the fashioning of selves? Such work makes visible patterns and forms of difference that may not correlate with demographic categories of race, gender, and class. Finally, following metaphors introduces the possibility of speculating on alternative cultural possibilities, including narrative experiments with counterfactual scenarios. While sharing with readers an anthropological method of critical thinking, such speculation also forces more explicit attention to how anthropological accounts participate critically within their fields of study. How, for example, might choices of theory and method combine with aspects of researchers’ identities to shape pathways of intervention? Work along these lines follows how sonography and amniocentesis accelerate the introduction of medical expertise into pregnancies (Rayna Rapp); the cultural origins of Western science and its shifts over time (David Hess); how people calculate in diverse settings (Jean Lave); the cultural meanings of medicine in everyday life (Margaret Lock); the narrative construction of scientists’ autobiographies (Michael Fischer); creative adjustments when medical expertise fails to account for pregnancy loss (Linda Layne); the flows of meanings that constituted the reproductive sciences as sciences (Adele Clarke); what living with gender assumptions means for women scientists (Margaret Eisenhart); how Bhopal lives as fact and image in different enunciatory communities (Kim Fortun); and how patents travel around the world (Marianne De Laet). Future work is likely to elaborate questions of scale and identity. How do metaphors gain effects at different scales and how do meanings that live at different scales combine in people’s lives and selves? How do people respond to meanings that summon
and challenge them, positioning themselves in searches for identities that work (see also Identity in Anthropology)?
5. Retheorizing the Boundary between Science/Technology and People
The publication of Donna Haraway’s Simians, Cyborgs, and Women in 1991 marked the emergence of projects that retheorize the boundary itself. Moving through the project of following flows of metaphors to reimagine categories that ‘implode’ on one another, Haraway calls for ‘pleasure in the confusion of boundaries and … responsibility in their construction’ (1991, p. 150). Claiming the cyborg (see Cyborg) as a feminist icon and key marker of what she calls the ‘New World Order,’ Haraway forces attention to the contemporary dissolution of boundaries between human and animal, human and machine, and physical and nonphysical. Challenging the ‘god-trick’ of universalism, she poses ‘situated knowledges’ (see Situated Knowledge: Feminist and Science and Technology Studies Perspectives) as a means of holding simultaneously to a radical historical contingency and a no-nonsense commitment to faithful accounts of a ‘real’ world. This project calls attention to burgeoning collections of activities involving science and technology that live across the purported separation between them or across their boundary with people. It includes the exploration of emerging fields that do not fit conventional disciplinary categories, such as biotechnology, biomedicine, and bioengineering, and documents the decline and loss of a distinction between basic and applied science. Following novel activities in research and production motivates the invention of new labels for the anthropological object of study, including, for example, ‘technoscience’ and ‘technoculture.’ The project of retheorizing the boundary between science/technology and people highlights the presence of the nature/culture distinction in anthropology of science and technology as well as in other areas of STS inquiry and intervention. Pressing questions include: through what sorts of processes do analytic findings and interpretations become naturalized as facts in everyday, or popular, modes of theorizing? In what ways might analytic accounts, including claims about culture, depend upon facts from popular theorizing? How might new modes of theorizing about analysis contribute to rethinking the nature/culture distinction itself? Finally, by calling attention to the difficulty of living within existing categories while attempting to theorize and embody new ones, retheorizing the boundary sharpens the question of intervention. Locating experiences that live across categories invites researchers
to examine how emergent categories impact and inflect old ones. Through what sorts of pathways might reformulations actually intervene and achieve change in specific cases? How might it be possible to assess the extent to which reformulations and refigurations prove, in fact, to be helpful, and to whom? Contributions to this diverse project explore how ideas of the natural help constitute cultural ways of knowing (Marilyn Strathern); the situatedness of practices in machine design and machine use (Lucy Suchman); opportunities for a ‘cyborg anthropology’ that studies people without starting with the ‘human’ (Gary Downey, Joseph Dumit, Sarah Williams); how reproductive technologies blur the facts of life (Sarah Franklin); emerging refigurations to constitute and categorize artificial life (Stefan Helmreich); the activities of biotechnology scientists who move outside the academy to gain academic freedom (Paul Rabinow); the importance of ‘good hands’ and ‘mindful bodies’ in laboratory work (Deborah Heath); how practices of tissue engineering materialize new life forms (Linda Hogle); experiences with PET scanning that escape the nature/culture distinction (Joseph Dumit); the contemporary production of ‘cyborg babies’ (Robbie Davis-Floyd and Joseph Dumit); experiences of computer engineers that belie the separation of human and machine (Gary Downey); possibilities for reinvigorating general anthropology by locating technology and humans in the unified frame of cyborg (David Hakken); and how African fractal geometry escapes classification in either cultural or natural terms alone (Ron Eglash). (See also Actor Network Theory for an approach that defines both humans and nonhumans as ‘actants.’) Future work is likely to forge novel alliances among theoretical perspectives previously separated by the nature/culture distinction. Actively locating academic theorizing in the midst of popular theorizing will likely force modalities of intervention into focus. Finally, to the extent that the anthropology of science and technology succeeds in challenging and replacing the simplistic dominant model by blurring and refiguring the boundary between science/technology and people, wholly new projects of inquiry and intervention will have to emerge to take account of the changing context. See bibliography for additional reviews.
See also: Actor Network Theory; Common Sense, Anthropology of; Cultural Studies of Science; Culture: Contemporary Views; Ethnography; Gender and Technology; Globalization, Anthropology of; History of Science; History of Technology; Identity in Anthropology; Indigenous Knowledge: Science and Technology Studies; Interpretation in Anthropology; Knowledge, Anthropology of; Science, New Forms of; Science, Sociology of; Scientific Culture; Scientific Disciplines, History of; Symbolism in Anthropology; Technology, Anthropology of
Bibliography
Davis-Floyd R, Dumit J (eds.) 1998 Cyborg Babies: From Techno-Sex to Techno-Tots. Routledge, New York
Downey G L, Dumit J (eds.) 1998 Cyborgs & Citadels: Anthropological Interventions in Emerging Sciences and Technologies. SAR Press, Santa Fe, NM
Franklin S 1995 Science as culture, cultures of science. Annual Review of Anthropology 24: 163–84
Haraway D 1991 Simians, Cyborgs, and Women: The Reinvention of Nature. Routledge, New York
Heath D, Rabinow P (eds.) 1993 Bio-politics: The anthropology of the new genetics and immunology. Culture, Medicine and Psychiatry 17 (special issue)
Hess D J 1995 Science and Technology in a Multicultural World: The Cultural Politics of Facts and Artifacts. Columbia University Press, New York
Hess D J, Layne L L (eds.) 1992 The anthropology of science and technology. Knowledge and Society 9
Layne L L (ed.) 1998 Anthropological approaches in science and technology studies. Science, Technology, & Human Values 23(1) (special issue)
Martin E 1987 The Woman in the Body. Beacon Press, Boston
Nader L (ed.) 1996 Naked Science: Anthropological Inquiry into Boundaries, Power, and Knowledge. Routledge, New York
Traweek S 1988 Beamtimes and Lifetimes: The World of High Energy Physicists. Harvard University Press, Cambridge, MA
G. L. Downey
Science and Technology: Internationalization
Science is predominantly done in national scientific communities, but scientific information regularly crosses boundaries. Various conditions influence this flow, which means that it is not symmetric in all directions. Internationalization is the uneven process wherein cross-border linkages of communication and collaboration in science and technology among countries multiply and expand. Linkages involve individual scientists and their institutions, but also, increasingly, governments through treaties and conventions that include strong science and technology components. The growing density of interconnections between large industrial corporations regarding research in the precompetitive phase as well as technological alliances is also relevant.
1. Some Distinctions
1.1 Scientific Internationalism
Scientific internationalism emerged with the development of national academies of science (Crawford et al. 1993). Later, correspondence and interchange between individual scientists were channeled through the international scientific unions and congresses that appeared
in the nineteenth century (Elzinga and Landström 1996). The flare-up of nationalism and World War I caused a temporary rupture in this internationalism, both in mode of organization and spirit. After the war, transnational scientific relations were gradually repaired, even with Germany, but at the cost of promoting a strict neutralist ideology, nurturing an image of science disembodied from and standing above society.
1.2 Scientific Internationalization and Economic Globalization
The multiplier effect of interconnections between transnational corporations (TNCs) and financial institutions that shape trade-related agreements and intellectual property regimes influences, but must not be confused with, scientific internationalization. It is a ‘globalization’ process in the economic sphere, driven by for-profit motives, facilitated by a combination of neoliberal politics, privatization, and information-technological developments. Internationalization of science, by contrast, is ultimately predicated on trust and solidarity among intellectual peers, even though it is interpenetrating increasingly with economic globalization. A further distinction is between quantitative and qualitative aspects of internationalization. Quantitative studies deal with numbers and patterns of cross-border linkages, and multinational interconnectivity or collaboration and cooperation. They provide indications of changing trends, but for deeper insights they need to be complemented by studies of qualitative changes, e.g., the emergence of new institutional arrangements and incentive systems that facilitate international exchange, reorientation of local research agendas, or harmonization of approaches to policy and priorities, like foresight methodologies, at national and regional levels. Internationalization of industrial research and development (R&D) is now recognized as an important research topic. Statistical surveys and empirical case studies confirm increases in the numbers of multicountry patents held by individual firms, as well as a proliferation of technological alliances within and between a triad of trading blocs (NAFTA, the North American Free Trade Agreement; the EU, the European Union; and the developing Asian economies plus Japan). In this respect it is actually more appropriate to speak of a ‘triadization’ than of economic globalization. Pertinent literature on internationalization of technology also refers to other types of interaction between TNCs, and it deals with various R&D management strategies in this context. But in this respect also, extant overviews are limited largely to providing taxonomies and typologies of discernible patterns (Research Policy 1999; for further details, see National Innovation Systems and Technological Innovation).
Qualitative aspects of internationalization also include epistemic change: i.e., in the intellectual contents of scientific fields, namely, dominant perspectives, methodologies, and theories. Thematic integration appears along research fronts, as well as sectoral lines where problems transcend national boundaries (acid rain, global climate change, AIDS), or are too costly for a single nation to handle alone (e.g., CERN). In these instances, and more generally, when international research programs foster coordination of national contributions they also force standardization of data formats, preferred instrumentation, and experimental practices. In what follows, transformations of interconnectivity consonant with internationalism are probed in cartographic, institutional, and epistemic dimensions, with political aspects also thrown into relief.
2. Tracing the Span and Patterns of International Networks
At the time of writing, the volume of science, national outputs relative to monies allocated and to gross national product (GNP), and the distribution of research efforts and patenting across the globe are mapped regularly. This is done with the help of publication counts (of papers) and reviews of who cites whom (Price 1986) to trace citation patterns and coauthorship linkages (see Scientometrics). Visibility of scientists from different countries or regions is compared. Evaluations of research performance use science and technology indicators as proxies for the effectiveness, quality, and international standing of research groups and their institutions. The span of global networks increased during the last decades of the twentieth century. Coauthorship patterns reveal growing contacts of scientists across national borders during the 1970s and 1980s (Hicks and Katz 1996), with a certain slackening in the 1990s (UNESCO 1998). Leading scientific producers such as the USA engage less in international coauthorships than do smaller countries—the larger the national or regional scientific community, the greater its ‘self-reliance.’ Generally, internationalism is only played up in large countries when scientific leadership is in decline, or other countries possess desired specialty knowledge. An interesting anomaly is that India and China also figure lower in cross-border coauthorships than might be expected. Latin America has remained constant, while Africa, especially its sub-Saharan part, stands out as most disadvantaged. Concentration of resources, prestige, authority, and recognition thus displays regional variations, with densities following continental contours dominated by an Anglophone region. The predominance of English as the main language of scientific communication, as evidenced in the Science Citation Index (SCI), is also
increasing. International coauthorship furthermore reveals subclusters (e.g., the Scandinavian countries) influenced by geographical vicinity, historical traditions, and linguistic as well as cultural affinities. The number of Third World countries now participating in world science has increased, but the vast majority still belong to the scientific periphery (Schott 1998). The share of liberal democracies in world science in 1986 was nearly five times their share of world population, while the poorer countries’ share in scientific production accounted for only one-tenth of their share of the world population. With a few remarkable exceptions (India, Brazil, China) this global gap became even more exaggerated during the 1990s (Schott 1993). Increased connectivity, scope, and participation in scientific communication across national borders, in other words, cannot be equated with decreasing hierarchization. Rather, ‘globalization’ of institutional models and participation in science is accompanied by a deglobalization in the dispersion of science. This contrasts sharply with the notion of science as a public good. Scientific knowledge remains highly concentrated where it is first created, namely among OECD countries. The scientific centers in advanced industrial nations, by virtue of prestige and scientific achievement (often measured by the relative density of Nobel laureates in specific fields), also exercise influence over work done in peripheral countries. A Eurocentric skew persists in power, resources, problem selection, and overriding perspectives on the content and function of science and technology. The end of the Cold War opened a new era of pluricentrism, primarily around three large regional blocs: North America, the European Union, plus Japan and developing economies in Asia. These cleavages are reflected in transnational citation impact and coauthorship clusters, and they coincide with current density patterns in international trade and technology alliances (EC 1997, pp. 10–11, 93).
3. Driving Factors
Public spending on R&D is more intense in the USA than in the EU, while in terms of patenting, Japan has caught up with the USA, and the EU rates third (EC 1997, pp. 53, 93). Economic globalization is a second driving force to reckon with, also uneven. With the world’s R&D investments concentrated in a few high-technology industries, these industries lead strategic reconfigurations in S&T landscapes. Rapid advances in information and communication technologies interlock with economic development, providing new vehicles for rapid interaction (e.g., the Internet), and spurring further alliances and integration of knowledge in pre-competitive phases. Simultaneously, science is being drawn more deeply into the economic globalization process by policy responses, as countries
and whole regions (e.g., the EC through its Framework Programs) facilitate technological development for highly competitive world markets. The EC’s mobility schemes for graduate students and postdocs must be seen in this light, and so also—by extension—events like the turn-of-the-millennium announcement of a cooperative bilateral university-level agreement between two of the world’s most prestigious institutions, MIT (Massachusetts Institute of Technology, USA) and Cambridge University (UK), and the EC’s new integrative policy for ‘the European Research Area.’ Such events herald a new phase in the intermeshing of the two processes, scientific internationalism and economic globalization. An ideological driving factor is the traditional cosmopolitan ethos inherent to academe. Associated with modernity, it incorporates the idea of progress, with a history parallel to the emergence of the nation-state, democracy, and secularization. This means it is in fact culturally bound, as are its CUDOS components enunciated by Robert Merton as the social glue of modern scientific institutions: intellectual Communism, Universalism, Disinterestedness, and Organized Skepticism (see Norms in Science). Recent deconstruction of such norms, standards, and models of science by scholars in the newer sociology of science highlights the role of particularism, drawing attention to social mechanisms of negotiation between various actors, and how these shape or ‘stabilize’ scientific consensus that can never be final. This captures the cultural diversity of scientific practices, but not the basic ideological import of universalism as an ideal in scientific internationalism. A recent factor of internationalization resides in the global nature of environmental problems and the demand for concerted political action on a global scale to address them. Increasing numbers of treaties and conventions with strong science and technology components have been drawn up in this vein. On a converging track one finds pressures of escalating costs of large-scale facilities and calls to share budgetary burdens in newer as well as more traditional fields. The end of the Cold War was also a triggering factor. In its wake appear programs to aid Eastern and Central European scientists in their transition to capitalism and market-driven incentives, and new forms of international science and technology cooperation. Finally, there are the local pressures in smaller nations that push scientists in settings of relative isolation in all parts of the world to integrate their work more closely with research fronts. Here the benefits of internationalization and the added intellectual stimulation it entails sometimes have the makings of a self-fulfilling prophecy. Pressures to increase local visibility and gain international recognition by exposure to wider peer control of scientific quality get entrenched in procedures and new funding opportunities at national research councils and
universities, affecting the behavior of individuals and groups of researchers, and their success in getting grants.
4. Explanatory Models
Traditional literature on the history of international scientific organizations usually distinguishes two types of fora: scientific nongovernmental and intergovernmental organizations (scientific NGOs and scientific IGOs). These are taken to represent two different institutional mechanisms for fostering internationalization. In the innovation literature, on the other hand, the focus has mostly been on firms and their role in international diffusion of technologies. In general, there are two broad strands of theorizing about international organizations. One is rooted in the assumption that networks and more stable forms of organization arise in response to considerations of efficiency and rational goal-oriented behavior, in the course of which less viable alternative forms of interaction are circumvented. An economistic variant of this approach is implicit in an evolutionary theory of technological advance and diffusion. Here ‘market’ is taken as the selection mechanism that filters out a set of particular technologies from many potentially possible ones, and these get diffused across the globe (Nelson and Winter 1982). Explicitly or implicitly, the market is regarded as being determined by the tastes, preferences, and purchasing power of potential technology users, who are treated as exogenous. Internationalization, in turn, becomes largely a unidirectional product of technological regimes or trajectories which cross and are taken up in socially constituted selective environments. Depending on similarities in the selection mechanisms, or ‘learning’ between national innovation systems, the transfer of ideas and technologies may be harmonized successively at a global level. Theoretical assumptions in the historiography of scientific organizations, or alternatively in regime theory within the study of international relations, run somewhat parallel to this. Instead of the prominence attributed to an economic imperative, in these literatures ideological or political imperatives, and even communities of experts, may be foregrounded. One therefore gets the picture of internationalization primarily as the product of ideologically driven self-direction by the scientific community (in the case of scientific NGOs) or of governments’ guiding hands, plus rules and experts (in the case of scientific IGOs). In all these accounts, the organizations in question are regarded more or less as mechanisms through which other agencies act. They are not depicted as purposive actors with an autonomy, power, or culture of their own, even if regime theory has been criticized for giving too much prominence to experts while
obfuscating the role of the nation-states that invest them with authority. In contrast to this, a sociological strand of theorizing takes its point of departure in Weber’s view of bureaucracies and focuses squarely on issues of legitimacy and power. Its advocates seek to explain a much broader range of impacts organizations can have, among others by virtue of their role in constructing actors, interests, and social purpose. Emphasized are complexity, multiplicity, and flexibility, with actors incorporated as endogenous to the processes of change. In this perspective it becomes interesting to consider how fora for international interaction, once they are created, take on a life of their own, exercising power autonomously in ways unintended and unanticipated by scientific associations or governments at the outset. The same can be said about conventions or regimes introduced to regulate and standardize intercourse between firms internationally in the fields of trade or intellectual property rights; norms are taken to have significant repercussions on the character of the interface between science and industry, and on the use of expertise in other realms. The constructivist approach associated with sociological institutionalism thus explains the emergence of the relatively autonomous powers of new international fora, in terms of the rational-legal authority they embody, emphasizing the new interests for the parties involved, and the concomitant learning process in which certain organizational models become diffused across the globe. New bureaucracies as they develop are seen to provide a generic cultural form that shapes the various forums in specific ways in their respective domains (firms, scientific NGOs, and scientific IGOs). New actors are seen to be created, responsibilities specified, and authority delineated, defining and binding the roles of both old and new actors, giving them meaning and normative values. In this model, culture, imagery, and rhetoric are held to be forceful ingredients in the life of international organizations, especially in the way these play out their roles in constructing social worlds with a global reach (Finnemore 1993, Barnett and Finnemore 1999). With the foregoing in mind, the next section highlights some empirical facets, with particular regard to the two most obvious institutional mechanisms (and associated with them, a few typically prominent actors and programs) pertinent to internationalization of—mostly academic, but also governmentally directed—science (as distinct from internationalization of R&D in industrial enterprise—for this see National Innovation Systems).
5. Two Institutional Mechanisms: Scientific NGOs and IGOs
The numbers and influence of scientific NGOs and IGOs have grown tremendously since the 1980s. Now
they not only find themselves interacting, but also pulled in different directions by lobbies of both transnational corporations (TNCs) driven by the profit motive, and nongovernmental civic society organizations (social movement NGOs). The latter are frequently fired by an ethic of equality and justice. Truth, politics, money, and human equality or justice, then, are the four ‘logics’ that meet in international forums when strategies to tackle global problems are negotiated, e.g., Rio 1992; in the process new international groups of experts join the scene. This is an important field for future studies (Rayner and Malone 1998). Analytically, it is useful to draw a distinction between autotelic and heterotelic organizations, especially between ones meant to serve science as an end in itself, and ones that are created and sustained by governmental action (Elzinga and Landström 1996). This is parallel to the distinction between policy for science and science for policy, serving to mark institutional separation of science from politics, an aspect central to arguments regarding the integrity and objectivity of expert knowledge under strain (see Scientific Controversies).
5.1 Scientific NGOs
In general, nongovernmental mechanisms operate directly between research communities of different countries, without the intervening medium of governments. They are autotelic, the premise being that communities of scientists are best left to themselves to organize their transnational contacts for common goals. A unique example is the International Council of Scientific Unions (ICSU), the umbrella organization that in 1931 sprang from an older cosmopolitan ideal (Greenaway 1991). It coordinated the Second International Polar Year (1932–3), the forerunner of a series of programs of global, often multidisciplinary studies that began in 1952 with the plan for the International Geophysical Year (1957). Present-day successors are the International Geosphere-Biosphere Program (IGBP) and the World Climate Research Program (WCRP). These cut across disciplines pertinent to research into global environmental problems. Each has several subprograms dealing with specific themes and aspects of global climate change. Other major programs cover biodiversity (DIVERSITAS) and the International Human Dimensions of Global Environmental Change program (IHDP), cosponsored with the International Social Science Council (ISSC), ICSU’s smaller sister organization for the social sciences, created in 1952. A milestone event is the World Conference on Science in Budapest (ICSU 1999), sponsored jointly with the United Nations Educational, Scientific and Cultural Organization (UNESCO), one of the world’s most wide-ranging intergovernmental organizations. Since
its foundation in 1945, UNESCO has worked constantly to link the peripheries to the centers of science.
5.2 Scientific IGOs
Scientific IGOs typify the second (heterotelic) type of mechanism for internationalization. They promote scientific interchange via governmental channels. The point of departure is not scientific knowledge production as such, but the use of science for a particular purpose that forms the basis for concerted action. In such contexts science is made a vehicle for the promotion of cultural, economic, political, and other goals at regional and global levels. UNESCO has already been mentioned. The World Meteorological Organization (WMO) and the World Health Organization (WHO) are two of many others within the UN family; in the domain of technology too there are many IGOs, some of which set and regulate standards. The World Bank is another significant actor. A recent addition is the Intergovernmental Panel on Climate Change (IPCC), which follows up on the science produced under the auspices of IGBP and WCRP, harmonizing research-based statements in advice to governments. This has important repercussions for scientific and technological pursuits. Some IGOs serve stakeholder interests in specific regions of the world, OECD being an example. Since 1963 it has been a pacesetter in developing R&D statistics, and science and technology policy doctrines, contributing to some harmonization between countries. A recent innovation was the creation of the Forum for Megascience (1992), renamed the Global Science Forum. It is responsible for periodical reviews of very large-scale projects so costly (in the order of billions of US dollars per year) and complex that they require multinational collaboration. There science and international diplomacy must meet to produce unique management cultures (Watkins 1997). Megaprojects can be concentrated at one site (e.g., CERN) or distributed (e.g., the Human Genome Project). Deep ocean drilling, climate change, thermonuclear fusion experimentation, as well as Antarctic research are further examples. Antarctica is a continent shaped by science as a key vehicle in an international regime outside the UN (Elzinga, in Crawford et al. 1993).
6. Intermesh and Thematic Integration
Proliferation and interpenetration of scientific NGOs and IGOs over past decades have gone hand in hand with a corresponding growth in numbers of civic NGOs and corporate lobbies. These also interact with scientific bodies, adding to the hybrid criss-cross of connections in which scientific and political agendas converge and blend. Leading scientists respond to the strain by constantly reemphasizing the need to protect
the objectivity and integrity of scientific knowledge claims (Greenaway 1991). Conflicts have emerged around attempts to commercialize and privatize national databases, with both ICSU and IGOs, like the WMO, coming out strongly in favor of a policy for open access to information that is invaluable for world research on global problems. Similar tensions have evolved over intellectual property in biotechnology, where scientists are more apt to be of two minds (see Intellectual Property, Concepts of). In the wake of globalization, in Third World countries in particular, both scientists and politicians have reacted against the design of Trade-Related Intellectual Property Rights (TRIPS) as negotiated within the World Trade Organization (WTO). Around these and other issues, new tensions arise to pull scientists in several directions along the four different ‘logics’ (delineated at the beginning of Sect. 5). Thematic integration of research agendas is a second qualitative dimension needing further study. How problems are framed, and how data and concepts are formed, involves interpretation and epistemological imperatives (see Situated Knowledge: Feminist and Science and Technology Studies Perspectives). This is apparent in research on global climate change, where large international scientific programs (IGBP and WCRP) work hand-in-hand with the IPCC to orchestrate problem sets, preferred methodologies, and modeling criteria. Core sets of concepts spun in the world’s leading scientific countries within epistemic communities of climatologists around General Circulation Models (GCMs) have a bearing on the type of data field workers should look for, and the format in which the data are cast; special funding programs exist that enroll scientists from the scientific peripheries. Creating global consensus around a core of scientifically accepted knowledge and streamlining homogeneous accounts to spur concerted political action have epistemological implications beyond science. As activities, they contribute to the formation of common world views. Concepts such as ‘global warming potential’ (of greenhouse gases) help to build bridges between science and political decision-making. The very notion of ‘global climate,’ as well as ‘Earth system science,’ are other examples where conceptual work facilitates cognitive integration, both over disciplinary boundaries in science and in interfaces with citizens at large (Jasanoff and Wynne, in Rayner and Malone 1998). They change our world picture in a reimagining of humankind in its encounters with nature (e.g., ideas like anthropogenic ‘fingerprints’). Representations of local climates, the atmosphere, and circulation systems in the oceans are linked up with representations of human ecology (e.g., land use), which in turn are linked to conceptions of risk and truth (see Risk, Sociology and Politics of). Internationalization of science resonates with a gradual reorientation of perspectives and scientific practices in unitary fashion. To what extent alternatives to expertise as avenues of
knowledge production are foreclosed remains a contentious issue.
See also: History of Science; History of Technology; International Organization; International Science: Organizations and Associations; Science and Technology, Social Study of: Computers and Information Technology; Scientific Academies, History of; Universities and Science and Technology: Europe; Universities and Science and Technology: United States
Bibliography
Barnett M, Finnemore M 1999 The politics, power, and pathologies of international organizations. International Organization 53: 699–732
Crawford E, Shinn T, Sörlin S (eds.) 1993 Denationalizing Science. Kluwer, Dordrecht, The Netherlands
European Commission 1997 Second European Report on S&T Indicators. Office for Official Publications of the European Communities, Luxembourg
Elzinga A, Landström C (eds.) 1996 Internationalism and Science. Taylor Graham, London
Finnemore M 1993 International organizations as teachers of norms—the United Nations Educational, Scientific and Cultural Organization and science policy. International Organization 47: 565–97
Greenaway F 1991 Science International: A History of the International Council of Scientific Unions. Cambridge University Press, Cambridge, UK
Hicks D, Katz J S 1996 Where is science going? Science, Technology and Human Values 21: 379–406
ICSU 1999 Science International (Sept., special issue). ICSU Secretariat, Paris
Nelson R, Winter S 1982 An Evolutionary Theory of Economic Change. Harvard University Press, Cambridge, MA
Price D de S 1986 Little Science, Big Science and Beyond. Columbia University Press, New York
Rayner S, Malone E (eds.) 1998 Human Choice and Climate Change. Battelle Press, Columbus, OH, Vol. 1
Research Policy 1999 Special issue: The internationalization of industrial R&D. Research Policy 28: 107–36
Schott T 1993 World science: Globalization of institutions and participation. Science, Technology and Human Values 18: 196–208
Schott T 1998 Between center and periphery in the scientific world system: Accumulation of rewards, dominance and self-reliance in the center. Journal of World-Systems Research 4: 112–44
UNESCO 1998 World Science Report. Elsevier, Paris
Watkins J D 1997 Science and technology in foreign affairs. Science 277: 650–1
A. Elzinga
Science and Technology, Social Study of: Computers and Information Technology
Roots and origins being elusive and often illusory, I will date this article from about 1980, when some key collaborations between sociotechnical systems (STS) scholars and maverick computer scientists began. This passes lightly over some early important work critiquing automation, such as that of J. D. Bernal, the early days of the STS movement in Europe, particularly in England, Germany, and Scandinavia, and the neo-Marxian analyses of labor process, many of which came to inform later STS work described below. The nexus of work that currently links STS with computer and information science is a very complex one, with roots in all the areas described below, and held together by a strong, invisible college. This group shares a common concern with how computers shape, and are shaped by, human action, at varying levels of scale. The links with STS include concerns about computer design within the social construction of technology; computers as an agent of social or organizational change; ethics and/or computing (Introna and Nissenbaum 2000); critical studies of computers and computer/information science; applied, activist, and policy research on issues such as the ‘digital divide,’ or the unequal distribution of computing and information technology across socioeconomic strata and regions of the world. Some contributions from this part of STS that have been used by scholars in many other parts of the field are Suchman’s ‘situated action’ perspective; Star’s ‘boundary objects’ (Star and Griesemer 1989); Forsythe’s methodological questions about ‘studying up’ and the politics of the anthropology of computing (1993); Henderson’s work on engineering drawings as ‘conscription devices’ (1999); Edwards’ work on computing and the Cold War, and its model of ‘closed and green’ worlds (1996); Berg’s critique of rationalization in medicine via computing (1997); Heath and Luff’s study of ‘centers of control’ via the technologies and interactions of the London Underground work force (2000); Woolgar’s program building and analytic work—much of it concerning the World Wide Web—from the Virtual Society? Program based at Brunel University (Grint and Woolgar 1997); Bowker’s concept of ‘infrastructural inversion,’ taken from information management at Schlumberger Corporation (1994); Yates’ historical examination of information control techniques in American business (1989, 1996, 1999); Hanseth et al.’s work on standards and ethics in information technology (1996); and Bowker and Star’s work on large-scale classification schemes (1999).
1. Automation and the Impact of Computerization
Many early studies of computers and society, or computers and organizations, concerned computing as an automation of human work. This included two major analytical concerns: the replacement of humans by machines in the workplace, and the ‘deskilling’ of existing functional jobs (Braverman 1998). Later, the analytic picture became much richer, as researchers realized that automation and deskilling were just two dimensions of the picture. An early breakthrough article by two social-computer scientists defined this territory as ‘the web of computing’ (Kling and Scacchi 1982), a term which became a touchstone for the next generation. It refers to the co-construction of human work and machine work, in the context of complex organizations and their politics. Much of this research took place in business schools; some in departments of sociology or industrial psychology. In Scandinavia and other regions with strong labor unions, partnerships between the unions and researchers were important, at first focused on job replacement and deskilling, and later, on design created through partnerships with users, social scientists, and computer designers (discussed below). The social impact of computing remains a strong research strain worldwide, despite the anthropological fact that finding ‘untouched tribes’ is increasingly difficult. Thus the problematics of interest to STS have shifted from initial impact on a group or institution, to understanding ongoing dynamics such as usage, local tailoring of systems, shifts in design approaches, understanding the role of infrastructure in helping to form impacts, and the ecology of computers, paper, telephones, fax machines, face-to-face communication, and so forth commonly found in most offices, and increasingly in homes and schools.
2. The Emulation of Human Cognition and Action (Artificial Intelligence and Robotics)
The early program of artificial intelligence (AI) work, beginning in an organized way during World War II, was to create a machine that would emulate human thinking (classic AI) and action (robotics). Some of the emulation work came in the form of intelligent tutoring systems and expert systems (Wenger 1987) meant to replace or supplement human decision making. The decision-making side of the research continues to be strong in management schools and in government agencies; it is often critiqued by STS researchers, although some do use these tools (Gilbert and Heath 1985, Byrd et al. 1992). The discussion about whether emulation of human thinking would be possible dates far back; it came to the attention of many STS researchers with the work of philosopher Hubert Dreyfus (1979, 1992), who argued for the irreducibility of human thought and therefore the impossibility of reproducing it in computers. Later STS researchers (including Forsythe, Star, and Suchman) formed direct collaborations with AI researchers. These
partnerships took both critical and system-building forms; in both cases, the social science researchers acted as informants about the nature of the ‘real world’ as opposed to the emulated world of AI. AI changed during the late 1980s and early 1990s—some computer scientists spoke of the ‘AI winter,’ referring to funding problems unknown in the 1970–88 period. As personal computing and e-mail spread, followed by the Web (from 1994), and the early promises of AI seemed not to be paying off, AI began to lose its prestige within computer science. Many AI researchers changed to problems in information science, software engineering, or cognitive science. A branch of AI, distributed artificial intelligence, continued to interact with the STS community (Huhns 1987, Huhns and Gasser 1989). Their interests were in modeling and supporting spatially and temporally distributed work and decision practices, often in applied settings. This reflected and bridged to STS concerns with community problem-solving, communication and translation issues, and the division of labor in large scientific projects.
3. The Enterprise of Computing, Its Military Roots and the Role of Activism
In 1983 US President Ronald Reagan introduced his (in)famous ‘Star Wars’ (officially known as the Strategic Defense Initiative, or SDI) military defense program, a massively expensive attempt to create a distributed laser and missile-based system in space that would protect the USA from foreign attack. The proposal put computer scientists, and especially software engineers, at the center of the design effort. An immediate outcry was raised by the computing community as well as by alarmed STS scholars close to it. There were concerns that a system of that magnitude was untestable; tools were lacking even to emulate the testing. There were concerns about its viability on other grounds. There were also concerns about its ecological impact on the field of computer science (akin to those raised by organismal biologists about the Human Genome Project)—that the lion’s share of research funds would be funneled into this project, orphaning others. A new group, Computer Professionals for Social Responsibility (CPSR) (www.cpsr.org), was formed in Silicon Valley as a focus for these concerns. The group quickly found common ground in many areas, including activist, ethical, policy, and intellectual questions. It has flourished and now sponsors or cosponsors three conferences a year in which STS scholars often participate: Computers, Freedom and Privacy (policy); Directions and Implications of Advanced Computing (DIAC) (intellectual and design directions, and their political and ethical bases and outcomes); and the Participatory Design Conference (PDC), which brings together users, community organizers, and computer and social scientists to work on issues of co-design and
appropriate technology (Kyng and Mathiassen 1997). It, like its counterparts in the UK (Computers and Social Responsibility) and elsewhere, became an important meeting ground between STS and computer/information science. This was especially true of those critical of military sponsorship of computing agendas (at one point in the US some 98 percent of all funding of computer science came from the military, especially from ARPA, the Advanced Research Projects Agency, later renamed DARPA, arguably the critical actor in the creation of the internet; see Abbate 1999). Beginning in the 1980s, a grassroots movement sometimes called ‘community computing’ arose, and linked with CPSR and other similar organizations. It attempted to increase access to computing for poor people and those disenfranchised from computing for a variety of reasons (Bishop 2000). This often involved the establishment of ‘freenets,’ or publicly available free computing. Terminals were placed in venues such as public libraries, churches, and after-school clubs; some were distributed directly to people in need. This general problematic has come to be called the ‘digital divide,’ and forms an active nexus of research across many areas of interest to STS. An important part of the attempt to enfranchise everyone arose in the world of feminist activism and scholarship, as early computer hackers were nearly all male, and there were sexist and structural barriers to women as both users and professionals. Another meeting place between STS and computer/information science was formed by numerous conferences on women in computer science, gender and computing, and feminist analyses of the problem choice and ethics of computer science. An excellent series of conferences, whose proceedings are published in book form every four years, is sponsored by IFIP (the International Federation for Information Processing), Working Group 9.1, under the title Women, Work and Computerization. See, for example, Grundy et al. (1997). There is a resurgence of interest in this topic with the advent of the Web (see, e.g., Wakeford 1999).
4. Design

Beginning in the early 1980s, but reaching full strength in the late 1980s, a number of STS and STS-linked scholars began studying information technology design. This took two forms: studying the work of doing design, and doing design as part of a multidisciplinary team. Henderson’s work (cited above) on the visual design practices of engineers is a good example of the former; other important works in the area include Kunda (1992) and a special issue on design of the journal Computer Supported Cooperative Work (Volume 5:4, 1996). STS scholars participating directly in design did so in a number of ways. Anthropologists (such as Nardi 1993, Nardi and O’Day 1999) and Orr (1996) looked at issues such as the culture of programming, its practices, and the role of technicians (Barley and Orr 1997). Some, such as Nardi, conducted usability tests for prototypes of systems. Sociologists examined work practices and organizational processes, and informed the computer scientists about the ways in which these issues could shape or block design (Star and Ruhleder 1996).
5. From ‘The Unnamable’ to Social Informatics: Shaping an Invisible College Linked to STS

An invisible college of social scientists, including STS researchers, who do–study–critique computer and information science has steadily grown since the early 1980s. A brief overview of some of the ‘sister travelers’ now growing in strength in STS itself includes the following.
5.1 Organization Theory and Analysis

People who study organizations have for some time been concerned with the use and impact of computing within them. During the 1990s, STS became an increasingly important theoretical resource for this field; a number of organization science conferences have invited keynote addresses by STS scholars, and organization researchers read STS work and attempt to apply it in organizational design and studies. The work of Latour and actor network theory has been especially important (Orlikowski et al. 1995). The policy area of STS has also become increasingly important where issues of privacy, employee rights, intellectual property, and risk are related to computer and information systems (regularly updated reports and references can be found on Phil Agre’s ‘Red Rock Eater’ electronic news service: http://dlis.gseis.ucla.edu/people/pagre/rre.html).
5.2 Computer-supported Cooperative Work (CSCW) and the Participatory Design Movement (PD)

Cognitive and experimental psychologists have been involved with the design of computer and information systems since at least the 1950s. They formed a field that came to be known as Human–Computer Interaction (HCI). This field focused originally on very small-scale actions, such as measuring keyboard strokes, attention and interface design, ergonomics of workstations, and usability at the level of the individual user. In the late 1980s, part of this field began to stretch out to more organizational and cultural issues (Grudin 1990, Bannon 1990). The impact of the
personal computer and the consequent decentralization of computing, and the rapid spread of networked computing to non-computer specialists, were important factors. This wing of HCI began to draw together computer scientists, sociologists, anthropologists, and systems analysts of like mind. In 1986 there was a search to put a name to this. Two of the major competitors were ‘office information systems’ (to replace the old ‘office automation’) and ‘computer-supported cooperative work (CSCW),’ which won out. CSCW is an interdisciplinary and international group that studies a range of issues including the nature of cooperation, critiques of computer science, and the building of systems to support cooperative work (both local and highly distributed). STS scholars have been part of this field since the beginning; Star was a founding co-editor of the journal. Her work and that of Suchman have been widely used in CSCW. Closely linked with CSCW is PD, the practice of designing computer systems with user communities and social informaticians. Annual conferences are held in tandem with CSCW. They have picked up and actively use STS approaches. The roots of PD have been reviewed recently by Asaro (2000); they include corporate initiatives, community development, social movements, and the Scandinavian School, discussed in the next section.
5.3 The ‘Scandinavian School’

In the 1950s the (powerful) trade unions in Scandinavia helped to pass ‘codetermination’ legislation. This law stated that unions must be involved in technological design—a requirement originally motivated by concerns about deskilling and job loss through automation. A form of sociotechnical systems analysis evolved into a set of techniques for studying workplaces and processes. As computers arrived in the workplace, this came to include progressive computer scientists and social scientists, many of whom now participate in STS publishing and conferences, as well as in CSCW and PD (Greenbaum and Kyng 1991, Bjerknes et al. 1987, Bødker 1991, Neumann and Star 1996).
5.4 Computers and Education

The use of computers for science education began in the 1960s. The study of this, based in schools of education and science departments, has included both critical components and basic research. In recent years, the addition of distance education via computers, internet classes, and the ubiquity of computers in college classes has become a critical component of this community. One branch is called Computer-Supported Collaborative Learning (CSCL), and has a
lively relationship with both STS and CSCW (Koschmann 1996). STS concepts are beginning to be used across all areas of science education, and recent STS conferences reflect the strengthening links between science education and STS.
5.5 Former Library Schools as an Emergent Nexus for STS Work

Since the 1980s, schools of library science have experienced massive closures, due to a complex of corporatization, declining public-sphere funding for public libraries, and a general move to close professional schools whose faculty and students are mostly women (social work and physical therapy schools have met similar fates). In the USA, a few of the surviving library schools have reinvented themselves as new ‘Schools of Information,’ most dropping the word ‘library’ from the title. Sites include the University of Michigan, University of Illinois at Urbana-Champaign, University of North Carolina, and Indiana University. Their faculty now include social scientists, computer and information scientists, and library scientists. STS work is central to many of the programs, and several faculty members from the STS world who work with information technology or information itself have found positions there. The structural shape of the programs differs in countries with different forms of funding and different configurations of the public sphere; STS work is also very influential outside the USA, for example at the Royal School of Librarianship in Copenhagen. Kling’s ‘Social Informatics’ page is an excellent resource (http://www.slis.indiana.edu/SI/), as is Myers’ ‘Qualitative Research in Information Science’ (http://www.auckland.ac.nz/msis/isworld/) (see also Kling 2000).
6. The Web: Cultural Studies, Economic Impact, Social Practices, and Ethics

The public availability of the World Wide Web from 1994 and the commercialization of the Internet–Web, combined with falling prices for computers with good graphics and cheap memory, have changed the face of computing dramatically. Many STS scholars have been involved in different facets of studying the Web. Cultural studies of chat rooms, home pages, MOOs and MUDs, and social inequities in distribution and use have exploded (see, e.g., Turkle 1995). Some of this work is framed by STS scholars as the study of the use and development of technology; some treats the Web as a lens through which technology mediates culture. E-commerce, including scientific publishing, has come to overlap with STS work (e.g., Knorr-Cetina and Preda 1998). Ethical areas, such as privacy, electronic stalking and harassment, hate speech sites, and identification of
minors on the Web (who are protected by human subjects regulations but impossible to identify in many e-venues) are among these issues (Friedman and Nissenbaum 1996).
7. Challenges and Future Directions

One of the difficult things for STS scholars who work in this area is the process of tacking back and forth between sympathetic communities housed in information and computer science, STS itself, and one’s home discipline. This appears at many levels: job openings, publications, choosing conferences to attend, and the growing call from within industry and computer science for ethnographers, social science evaluators, and co-designers. Sometimes this latter call takes the form of guest talks to technical groups, keynote addresses at technical meetings, consulting, or being asked for free advice about complex social problems within an organization. As with STS itself, there are a growing number of PhD programs within social informatics, broadly speaking, and the field is both spreading and converging. This may, in the long run, ease the juggling problem. At present, most researchers in the area pursue a double career, publishing and participating in both computer/IS and STS communities, with some bridge-building back to the home disciplines. Another challenge lies in the area of methods. Many social informatics/STS researchers inherited methodological practices from their home disciplines, and some of these make an uneasy fit with current technological directions. For example, there is now a great deal of survey research and of ‘ethnography’ being done on the Web, including intense study of chat rooms, e-discussion lists, and other forms of online behavior. On the survey side, sampling and validity problems loom large, although they are old problems with venerable methods literatures addressing their solution. The question of sampling only from those with e-mail hookups is similar to the old question of sampling only those who have telephones (a small simulation of this coverage problem follows below). Every sample has limits. However, validity issues, such as the practices of filling out e-mail forms, how the survey appears in the context of many other e-mail messages, and the impact of the genre itself on the content, are only now beginning to be explored. Similarly, for ethnographers, it is clear that e-mail messages are not the ready-made fieldnotes that many early studies (i.e., from the early 1990s) delighted in. Forms of triangulation between online and offline research are now coming to the fore in sophisticated methods discussions; another contribution lies in the little-studied interaction between the built information environment and the phenomenology of users (Wakeford and Lyman 1999).
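The coverage problem just noted can be made concrete with a small simulation. The sketch below is purely illustrative: the population size, the rate of e-mail access, and the attitude probabilities are invented for this example and are not drawn from any study cited here.

```python
# Minimal simulation of coverage bias in an online-only survey.
# All numbers are hypothetical, chosen only to illustrate the mechanism.
import random

random.seed(42)

POPULATION = 100_000
people = []
for _ in range(POPULATION):
    has_email = random.random() < 0.6        # assume 60% are reachable online
    # Assume the attitude being surveyed correlates with online access.
    p_agree = 0.7 if has_email else 0.4
    people.append((has_email, random.random() < p_agree))

true_rate = sum(agrees for _, agrees in people) / POPULATION

# An e-mail survey can, by construction, sample only the online subpopulation.
online = [agrees for has_email, agrees in people if has_email]
sample = random.sample(online, 1_000)
estimate = sum(sample) / len(sample)

print(f"population agreement rate: {true_rate:.3f}")   # roughly 0.58
print(f"e-mail sample estimate:    {estimate:.3f}")    # roughly 0.70, biased upward
```

No amount of random sampling within the online group removes this bias; only reweighting against known population figures, or triangulation with offline data, can do so, which is one reason the mixed online–offline methods mentioned above have come to the fore.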
In conclusion, growing overlaps between complex computer/information science groups, STS, and the other social worlds listed above mean new opportunities for STS. STS students are finding employment in corporate research and development (R&D) settings, new information schools, government policy employers concerned about information technology, and in the non-profit sector concerned with issues such as the digital divide. A note on access: much of this sort of material exists in the proceedings of conferences and is indexed by corporate or organizational authors. In libraries it is usually held in special computer science libraries on campus, and is, by social science standards, badly indexed. Seek help from the reference librarian to find this material. Working groups or special interest groups (SIGs, in computer science terminology) can be powerful loci of change, with thousands of members, both professional and academic. Some are of direct interest to STS scholars, such as SIGCAS, the Special Interest Group on Computers and Society run by the ACM (Association for Computing Machinery, the dominant professional organization for computer scientists in the USA), or ACM SIGCHI (the SIG on Computer–Human Interaction), which has grown so large it is now functionally its own professional organization. Although all professions have jargon and acronyms, computer and information science is highly dense with them. Scholars seeking to build bridges to the computer/information science community can consult online ‘acronym servers’ such as the Free On-Line Dictionary of Computing at http://foldoc.doc.ic.ac.uk/foldoc/index.html or the Acronym Finder at http://www.acronymfinder.com/.
8. Selected Journal Resources Publishing Some STS Work

Accounting, Management and Information Technologies (which, in 2000, was due to change its name to Information and Organization)
Organization Science
Computer Supported Cooperative Work (CSCW): The Journal of Collaborative Computing
Information Systems Research
The Scandinavian Journal of Information Systems
CSCL Learning Sciences
Journal of the American Society for Information Science
Human–Computer Interaction
The Information Society
Information Technology and People
CMC Magazine
Electronic Journal on Virtual Culture: http://www.monash.edu.au/journals/ejvc/

See also: Artificial Intelligence in Cognitive Science; Communication: Electronic Networks and
Publications; Communication: Philosophical Aspects; Computers and Society; Digital Computer: Impact on the Social Sciences; Human–Computer Interaction; Human–Computer Interface; Information and Knowledge: Organizational; Information Society; Information Technology; Information Theory; Mass Communication: Technology; Science and Technology: Internationalization; Telecommunications and Information Policy
Bibliography

Abbate J 1999 Inventing the Internet. MIT Press, Cambridge, MA
Asaro P M 2000 Transforming society by transforming technology: The science and politics of participatory design. Accounting, Management and Information Technologies 10: 257–90
Bannon L 1990 A pilgrim’s progress: From cognitive science to cooperative design. AI and Society 4(2): 59–75
Barley S, Orr J (eds.) 1997 Between Craft and Science: Technical Work in US Settings. ILR Press, Ithaca, NY
Berg M 1997 Rationalizing Medical Work: Decision-support Techniques and Medical Practices. MIT Press, Cambridge, MA
Bishop A P 2000 Technology literacy in low-income communities. (Re/mediating adolescent literacies). Journal of Adolescent and Adult Literacy 43: 473–76
Bjerknes G, Ehn P, Kyng M (eds.) 1987 Computers and Democracy: A Scandinavian Challenge. Avebury, Aldershot, UK
Bødker S 1991 Through the Interface: A Human Activity Approach to User Interface Design. Erlbaum, Hillsdale, NJ
Bowker G 1994 Information mythology and infrastructure. In: Bud-Frierman L (ed.) Information Acumen: The Understanding and Use of Knowledge in Modern Business. Routledge, London, pp. 231–47
Bowker G, Star S L 1999 Sorting Things Out: Classification and its Consequences. MIT Press, Cambridge, MA
Bowker G, Star S L, Turner W, Gasser L (eds.) 1997 Social Science, Information Systems and Cooperative Work: Beyond the Great Divide. Erlbaum, Mahwah, NJ
Braverman H 1998 Labor and Monopoly Capital: The Degradation of Work in the Twentieth Century, 25th Anniversary edn. Monthly Review Press, New York
Byrd T A, Cossick K L, Zmud R W 1992 A synthesis of research on requirements analysis and knowledge acquisition techniques. MIS Quarterly 16: 117–39
Dreyfus H L 1979 What Computers Can’t Do: The Limits of Artificial Intelligence, Revd. edn. Harper & Row, New York
Dreyfus H L 1992 What Computers Still Can’t Do: A Critique of Artificial Reason. MIT Press, Cambridge, MA
Edwards P N 1996 The Closed World: Computers and the Politics of Discourse in Cold War America. MIT Press, Cambridge, MA
Forsythe D E 1992 Blaming the user in medical informatics: The cultural nature of scientific practice. Knowledge and Society: The Anthropology of Science and Technology 9: 95–111
Forsythe D E 1993 The construction of work in artificial intelligence. Science, Technology, and Human Values 18: 460–80
Friedman B, Nissenbaum H 1996 Bias in computer systems. ACM Transactions on Information Systems 14: 330–48
Gilbert N, Heath C (eds.) 1985 Social Action and Artificial Intelligence. Gower, Aldershot, UK
Greenbaum J, Kyng M 1991 Design at Work: Cooperative Design of Computer Systems. Erlbaum, Hillsdale, NJ
Grint K, Woolgar S 1997 The Machine at Work: Technology, Work, and Organization. Polity Press, Cambridge, UK
Grudin J 1990 The computer reaches out: The historical continuity of interface design. In: Chew J, Whiteside J (eds.) Proceedings of the CHI ’90 Conference on Human Factors in Computing Systems, April 1–5. ACM Press, Seattle, WA, pp. 261–68
Grundy A F et al. (eds.) 1997 Spinning a web from past to future. Proceedings of the 6th International IFIP Conference, Bonn, Germany, May 24–27. Springer, Berlin
Heath C, Luff P 2000 Technology in Action. Cambridge University Press, New York
Hanseth O, Monteiro E, Hatling M 1996 Developing information infrastructure: The tension between standardization and flexibility. Science, Technology, and Human Values 21: 407–27
Henderson K 1999 On Line and on Paper: Visual Representations, Visual Culture, and Computer Graphics in Design Engineering. MIT Press, Cambridge, MA
Huhns M (ed.) 1987 Distributed Artificial Intelligence. Morgan Kaufmann, Los Altos, CA
Introna L, Nissenbaum H 2000 The politics of search engines. IEEE Spectrum 37: 26–28
Kling R 2000 Learning about information technologies and social change: The contribution of social informatics. The Information Society 16: 217–32
Kling R, Scacchi W 1982 The web of computing: Computing technology as social organization. Advances in Computers 21: 3–78
Knorr-Cetina K D, Preda A 1998 The epistemization of economic transactions. In: Sales A, Adikhari K (eds.) Knowledge, Economy and Society. Sage, New York
Koschmann T (ed.) 1996 CSCL: Theory and Practice of an Emerging Paradigm. Erlbaum, Mahwah, NJ
Kunda G 1992 Engineering Culture: Control and Commitment in a High-tech Corporation. Temple University Press, Philadelphia, PA
Kyng M, Mathiassen L (eds.) 1997 Computers and Design in Context. MIT Press, Cambridge, MA
Luff P, Hindmarsh J, Heath C (eds.) 2000 Workplace Studies: Recovering Work Practice and Informing System Design. Cambridge University Press, New York
Nardi B A 1993 A Small Matter of Programming: Perspectives on End User Computing. MIT Press, Cambridge, MA
Nardi B A, O’Day V 1999 Information Ecologies: Using Technology with Heart. MIT Press, Cambridge, MA
Neumann L, Star S L 1996 Making infrastructure: The dream of a common language. In: Blomberg J, Kensing F, Dykstra-Erickson E (eds.) Proceedings of PDC ’96 (Participatory Design Conference). Computer Professionals for Social Responsibility, Palo Alto, CA, pp. 231–40
Orlikowski W, Walsham G, Jones M, DeGross J (eds.) 1995 Information Technology and Changes in Organizational Work. Proceedings of IFIP WG8.2 Conference, Cambridge, UK. Chapman and Hall, London
Orr J E 1996 Talking About Machines: An Ethnography of a Modern Job. ILR Press, New York
Star S L 1988 The structure of ill-structured solutions: Heterogeneous problem-solving, boundary objects and
distributed artificial intelligence. In: Huhns M, Gasser L (eds.) Distributed Artificial Intelligence 2. Morgan Kaufmann, Menlo Park, CA, pp. 37–54
Star S L, Griesemer J 1989 Institutional ecology, ‘translations,’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–1939. Social Studies of Science 19: 387–420. (Reprinted in: Biagioli M (ed.) The Science Studies Reader. Routledge, London, pp. 505–24)
Star S L, Ruhleder K 1996 Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research 7: 111–34
Turkle S 1995 Life on the Screen: Identity in the Age of the Internet. Simon & Schuster, New York
Wakeford N, Lyman P 1999 Going into the (virtual) field. American Behavioral Scientist 43:
Wakeford N 1999 Gender and the landscapes of computing at an internet café. In: Crang P, Dey J (eds.) Virtual Geographies. Routledge, London
Wenger E 1987 Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Morgan Kaufmann, Los Altos, CA
Yates J 1989 Control Through Communication: The Rise of System in American Management. Johns Hopkins University Press, Baltimore, MD
Yates J 1996 Exploring the black box: Technology, economics, and history. Technology and Culture 37: 61–620
Yates J 1999 Accounting for growth: Information systems and the creation of the large corporation. Journal of Economic History 59: 540–2
S. L. Star
Science and Technology Studies: Ethnomethodology Ethnomethodology is a sociological approach to the study of practical actions which has influenced the development of constructionist, discourse analytic, and related approaches in science and technology studies (S&TS). Early ethnomethodological studies of ordinary activities and social scientific research practices developed an orientation to local practices, situated knowledge, and concrete discourse which later became prominent in science and technology studies. In addition to being a precursor to S&TS, ethnomethodology continues to offer a distinctive approach to practical actions in science and mathematics which rivals more familiar versions of social constructionism.
1. Ethnomethodological Research Policies

In the 1960s, Harold Garfinkel coined the term ethnomethodology as a name for a unique sociological approach to practical actions and practical reasoning (Garfinkel 1974, Heritage 1984, p. 45, Lynch 1993, pp.
3–10). Garfinkel was influenced by existential phenomenology (especially Schutz 1962) and sociological theories of action (especially Parsons 1937). As usually defined, ethnomethodology is the investigation of ‘folk methods’ for producing the innumerable practical and communicative actions which constitute a society’s form of life.
1.1 Ethnomethodological Indifference
Unlike methodologists in the philosophy of science who accord special status to scientific methods, ethnomethodologists examine methods for composing and coordinating ordinary as well as scientific activities. Garfinkel deemed all methods to be worthy of detailed study. This research policy is known as ethnomethodological indifference (Garfinkel and Sacks 1970, pp. 345–6, Lynch 1993, pp. 141–2). According to this policy, any method is worthy of study, regardless of its professional status, relative importance, adequacy, credibility, value, and necessity. This does not mean that ethnomethodologists treat all methods as equally ‘good’; instead, it means that they do not believe that preconceptions about the importance and validity of particular methods should determine a choice of research topic. Ethnomethodological indifference is similar in some respects to the more familiar policies of symmetry and impartiality in the Strong Programme in the sociology of scientific knowledge (Bloor 1976, pp. 4–5). However, there is a significant difference between the two, which has to do with a less obvious implication of the term ‘ethnomethodology.’ In addition to being a name for an academic field that studies ‘folk methods,’ the term refers to ‘methodologies’—systematic inquiries and knowledge about methods—which are internal to the practices studied. Many social scientists (including many in S&TS) assume that the persons they study perform their activities unreflexively or even unconsciously, and that the intervention of an outside analyst is necessary for explicating, explaining, and criticizing the tacit epistemologies that underlie such activities. In contrast, ethnomethodological indifference extends to the privileges ascribed to social scientific analysis and criticism: Persons doing ethnomethodological studies can ‘care’ no more or less about professional sociological reasoning than they can ‘care’ about the practices of legal reasoning, conversational reasoning, divinational reasoning, psychiatric reasoning, and the rest. (Garfinkel and Sacks 1970, p. 142)
Perhaps more obviously than other domains of practice, scientific research involves extensive methodological inquiry and debate. Science is not alone in this respect. In modern (and also many ancient) societies, large bodies of literature articulate, discuss, and debate the practical, epistemic, and ethical character of a broad array of activities, including legal procedures,
food preparation, dining, sexuality, child rearing, gardening, and fly fishing. Any effort to explain such practices sociologically must first come to terms with the fact that explanations of many kinds (including social explanations) are embedded reflexively in the history, production, and teaching of the activities themselves.
1.2 Ethnomethodological Reflexivity

Garfinkel (1967, p. 1) spoke of the ‘reflexive’ or ‘incarnate’ character of accounting practices and accounts. By this he meant that social activities include endogenous practices for displaying, observing, recording, and certifying their regular, normative, and ‘rational’ properties. In other words, social agents do not just act in orderly ways, they examine, record, and reflexively monitor their practices. This particular sense of ‘reflexivity’ goes beyond the humanistic idea that individuals ‘reflect’ on their own situations when they act, because it emphasizes collective, and often highly organized, accounting practices. The fact that the persons and organized groups that social scientists study already observe, describe, and interpret their own methodic practices can provoke challenges to the authority of social science methods when the latter conflict with native accounts. Whether or not a social scientist agrees with, for example, official accounts of scientific method given by the subjects of an ethnographic or historical investigation, it is necessary to pay attention to the more pervasive way in which methodic understandings, and understandings of method, play a constitutive role in the practices under study. For ethnomethodologists, the research question is not just ‘How do a society’s (or scientific field’s) members formulate accounts of their social world?,’ but ‘How do members perform actions so as to make them account-able; that is, observable and reportable in a public domain of practice?’ Accordingly, accounts are not simply interpretations made after actions take place; instead, actions display their accountability for others who are in a position to witness, interpret, and record them.
1.3 Topic and Resource
The idea that explanations, explanatory concepts, and, more generally, natural language resources are common to professional sociology and the activities sociologists study is a source of long-standing consternation and confusion in the social sciences (Winch 1990 [1958]). As Garfinkel and Sacks (1970, p. 337) noted:

The fact that natural language serves persons doing sociology, laymen or professionals, as circumstances, as topics, and as
resources of their inquiries furnishes to the technology of their inquiries and to their practical sociological reasoning its circumstances, its topics, and its resources.
An injunction developed by ethnomethodologists (Zimmerman and Pollner 1970), and adopted by proponents of discourse analysis (Gilbert and Mulkay 1984) is that social analysts should not confuse topic and resource. That is, they should not confuse the common sense explanations given by participants in the social activities studied with the analytic resources for studying those same activities. A variant of this injunction in social studies of science warns the analyst not to adopt, for polemical purposes, the very vocabulary of scientific authority that social studies of science make problematic (Woolgar 1981, Ashmore 1989). While this may be good advice in particular cases, when taken as a general policy the injunction not to confuse topic and resource may seem to encourage a retreat from the very possibility of describing or explaining, let alone criticizing, the practices in question. If every possible analytic resource is already found in the social world as a problematic social phenomenon, then what would a sociologist have to say that is not already incorporated into the practices and disputes studied? While acknowledging that this is a problem for any sociological explanation of science, Latour (1986) recommends a ‘semiotic turn’ which would examine the terms of the (scientific) tribe from the vantage point provided by an abstract theoretical vocabulary. But, unless one grants special epistemic status to semiotics, this ‘turn’ simply reiterates the problem.
2. Ethnomethodology and the Problem of Description

Ethnomethodologists do not agree upon a single solution to the problem of reflexivity, and contradictory ways of handling it are evident in the field, but some ethnomethodologists believe that sociological descriptions do not require special theoretical or analytical auspices in order to overcome reflexivity. As Sharrock and Anderson (1991, p. 51) argue, under most circumstances in which descriptions are made and accepted as adequate, there is no need to overcome epistemological scepticism. Ethnomethodologists treat reflexivity as ubiquitous, unavoidable, and thus ‘irremediable.’ Consequently, reflexivity is not a ‘methodological horror’ (Woolgar 1988, p. 32) that makes description or explanation impossible or essentially problematic, because an ethnomethodologist is in no worse (or better) shape than anyone else who aims to write intelligible, cogent, and insightful descriptions of particular states of affairs. Even so, there still remains the question of the scientific, or other, grounds of ethnomethodological descriptions.
2.1 The Possibility of ‘Scientific’ Descriptions of Human Behavior

A possible basis for ethnomethodological descriptions is presented in an early argument written by Sacks (1992, pp. 802–5), the founder of a field that later came to be called conversation analysis. This was a brief, but intriguing, argument about the possibility of developing a science that produces stable, reproducible, naturalistic accounts of human behavior. Sacks observed that natural scientists already produce accounts of human behavior when they write natural language descriptions of methods for reproducing observations and experiments. Social scientists who attempt to describe ‘methods’ in a broad range of professional and non-professional activities are in no worse position than natural scientists who attempt to describe their own particular methods. In other words, Sacks conceived of the replication of experiments as a particular (but not necessarily special) case of the reproduction of social structures. While social scientists do not aim to produce ‘how to’ manuals, they can write accounts of social actions which are ‘true’ in a praxiological sense: adequate to the production and reproduction of the relevant practices.
2.2 Replication and the Reproduction of Social Structure

It is well established in the sociology of science that replication is problematic. Instead of being a methodological foundation for establishing natural facts, particular instances of experimental replication often beg further questions about the detailed conditions and competencies they involve (Collins 1985). To say that replication is problematic does not imply that it is impossible, but it does raise the question of how scientists manage to secure assent to the adequacy of their experiments. The phenomenon of just how scientists conduct experiments, and how they establish particular results in the relevant communities, has become a major topic for socio-historians and ethnographers of science (Gooding et al. 1989). Consequently, in the more general case of the reproduction of social structure, it seems reasonable to conclude that Sacks identifies a topic for ethnomethodological research, but not a grounding for a possible sociological research program (Lynch and Bogen 1994). Ethnomethodological studies of the reproduction of order in science, other professions, and daily life address the topic of the production of instructed actions. Instructed actions include an open-ended variety of actions performed in accordance with rules, plans, recipes, methods, programs, guidelines, maps, models, sets of instructions, and other formal structures. In addition to examining efforts to reproduce initial observations, ethnomethodologists have studied a series of topics: deriving mathematical proofs, following instructions for photocopying, and reproducing standard laboratory protocols (Garfinkel 1986, 1991, Garfinkel et al. 1981, Livingston 1986, Suchman 1987).

2.3 Ethnomethodology and Social Constructionism

Ethnomethodology has an ambivalent relation to social constructionism. Early laboratory studies (Latour and Woolgar 1979, Knorr 1981) integrated selected themes from ethnomethodology into constructionist arguments, but many ethnomethodologists prefer to speak of the ‘production’ rather than the ‘construction’ of social orders (Lynch 1993, Button and Sharrock 1993). While it may be so that no fact is ever free of ‘construction’ in the broadest sense of the word, local uses of the term in empirical science (though not in mathematics and certain branches of theory) refer to research artifacts which are distinguished from a field of natural objects (Lynch 1985). Ethnomethodologists prefer not to speak indiscriminately of the construction or manufacture of knowledge, in order to preserve an ‘indifferent’ orientation to the way distinctions between constructed and unconstructed realities are employed in practical action and situated argument.

2.4 Normative and Ethical Considerations

Ethnomethodology has been criticized for treating normative aspects of the activities studied as ‘mere phenomena’ (Habermas 1984, p. 106). For example, when studying a contentious court case (Goodwin 1994) or jury deliberation (Maynard and Manzo 1993), an ethnomethodologist does not assess the discursive practices by reference to ideal standards of validity, rationality, or justice. This does not suppose that normative considerations are ‘mere’ phenomena. Instead, it supposes that people’s methods (the phenomena studied by ethnomethodologists) are just that: situated actions that, for better or worse, incorporate normative judgments and ethical claims. Instead of recasting such judgments and claims in terms of one or another transcendent framework, ethnomethodologists attempt to explicate the way normative judgments are produced, addressed, and fought over in specific circumstances. When ethnomethodologists investigate highly charged uses of, and appeals to, normative judgment, they enable readers to examine specific configurations of action and reasoning that do not neatly fall under recipe versions of norms and values. Consequently, rather than invoking or developing a normative theory, ethnomethodologists invite their readers to consider intricate situations of practice and contestation that no single general framework can possibly forecast or resolve. Practical and ethical dilemmas are no less salient for ethnomethodologists than for cultural anthropologists,
lawyers, and other participant-investigators. The policy of ethnomethodological indifference does not relieve an investigator of ethical choices and responsibilities. Like other investigators, ethnomethodologists may in some circumstances find it advisable to respect norms of privacy and propriety, while in others they may feel compelled to expose wrongdoing. However, as a body of doctrines, research policies, and exemplary studies, ethnomethodology does not supply a set of rules or ethical guidelines for making such difficult choices. This does not mean that ethnomethodologists proceed without ethics, but that their ethical judgments, like many of their other judgments, have a basis in communal life that is not encapsulated by any academic school, theory, or method.

See also: Ethnology; Ethnomethodology: General; Parsons, Talcott (1902–79); Reflexivity in Anthropology; Reflexivity: Method and Evidence
Bibliography

Ashmore M 1989 The Reflexive Thesis: Wrighting the Sociology of Scientific Knowledge. University of Chicago Press, Chicago
Bloor D 1976 Knowledge and Social Imagery. Routledge and Kegan Paul, London
Button G (ed.) 1991 Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Button G, Sharrock W 1993 A disagreement over agreement and consensus in constructionist sociology. Journal for the Theory of Social Behaviour 23: 1–25
Collins H M 1985 Changing Order: Replication and Induction in Scientific Practice. Sage, London
Garfinkel H 1967 Studies in Ethnomethodology. Prentice Hall, Englewood Cliffs, NJ
Garfinkel H 1974 On the origins of the term ‘ethnomethodology.’ In: Turner R (ed.) Ethnomethodology. Penguin, Harmondsworth, UK
Garfinkel H (ed.) 1986 Ethnomethodological Studies of Work. Routledge and Kegan Paul, London
Garfinkel H 1991 Respecification: Evidence for locally produced, naturally accountable phenomena of order, logic, reason, meaning, method, etc. in and as of the essential haecceity of immortal ordinary society (I)—an announcement of studies. In: Button G (ed.) Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Garfinkel H, Lynch M, Livingston E 1981 The work of a discovering science construed with materials from the optically discovered pulsar. Philosophy of the Social Sciences 11: 131–58
Garfinkel H, Sacks H 1970 On formal structures of practical actions. In: McKinney J C, Tiryakian E A (eds.) Theoretical Sociology: Perspectives and Development. Appleton-Century-Crofts, New York
Gilbert G N, Mulkay M 1984 Opening Pandora’s Box: An Analysis of Scientists’ Discourse. Cambridge University Press, Cambridge, UK
Gooding D, Pinch T, Schaffer S (eds.) 1989 The Uses of Experiment. Cambridge University Press, Cambridge, UK
Goodwin C 1994 Professional vision. American Anthropologist 96: 606–33
Habermas J 1984 The Theory of Communicative Action. Volume I: Reason and the Rationalization of Society. Beacon Press, Boston
Heritage J 1984 Garfinkel and Ethnomethodology. Polity Press, Oxford, UK
Knorr K 1981 The Manufacture of Knowledge. Pergamon, Oxford, UK
Latour B 1986 Will the last person to leave the social studies of science please turn on the tape recorder. Social Studies of Science 16: 541–8
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, London
Livingston E 1986 The Ethnomethodological Foundations of Mathematics. Routledge and Kegan Paul, London
Lynch M 1985 Art and Artifact in Laboratory Science. Routledge and Kegan Paul, London
Lynch M 1993 Scientific Practice and Ordinary Action: Ethnomethodology and Social Studies of Science. Cambridge University Press, New York
Lynch M, Bogen D 1994 Harvey Sacks’s primitive natural science. Theory, Culture and Society 11: 65–104
Maynard D, Manzo J 1993 On the sociology of justice: Theoretical notes from an actual jury deliberation. Sociological Theory 11: 171–93
Parsons T 1937 The Structure of Social Action. Free Press, New York, Vols. I & II
Sacks H 1992 Lectures on Conversation. Blackwell, Oxford, UK, Vol. I
Schutz A 1962 Collected Papers. Martinus Nijhoff, The Hague, Vol. I
Sharrock W, Anderson B 1991 Epistemology: Professional scepticism. In: Button G (ed.) Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Suchman L 1987 Plans and Situated Actions. Cambridge University Press, Cambridge, UK
Winch P 1990 [1958] The Idea of a Social Science and its Relation to Philosophy, 2nd edn. Humanities Press, Atlantic Highlands, NJ
Woolgar S 1981 Interests and explanations in the social study of science. Social Studies of Science 11: 365–94
Woolgar S 1988 Science: The Very Idea. Tavistock, London
Zimmerman D H, Pollner M 1970 The everyday world as a phenomenon. In: Douglas J D (ed.) Understanding Everyday Life: Toward the Reconstruction of Sociological Knowledge. Aldine, Chicago
M. Lynch
Science and Technology Studies: Experts and Expertise

1. Introduction

Science is a social activity to construct, justify, and critique cognitive claims based on widely accepted methodologies and theories within relevant
communities. The term science is used here to include claims of true statements about any phenomenon, be it natural, anthropological, social, or even metaphysical. Methods or procedures are clearly distinct among the various domains of science, but the main assertion here is that the overall goal of claiming truthful statements remains the essence of scientific inquiry throughout all disciplines (similar attempts in Campbell 1921, pp. 27–30). Using scientific expertise, however, is not identical with generating scientific statements (Lindblom and Cohen 1979, p. 7ff.). In a policy arena, scientific experts are expected to use their skills and knowledge as a means of producing arguments and insights for identifying, selecting, and evaluating different courses of collective action. Since such advice includes the prediction of likely consequences of political actions in the future, experts are also in demand to give advice on how to cope with uncertain events and how to make a prudent selection among policy options, even if the policy-maker faces uncertain outcomes and heterogeneous preferences (Cadiou 2001, p. 27). Many policy-makers expect scientific experts to help construct strategies that promise to prevent or mitigate the negative and promote the positive impacts of collective actions. In addition, scientific expertise is demanded as an important input to design and facilitate communication among the different stakeholders in debates about technology and risk. Based on these expectations, scientific expertise can assist policy-makers to meet five major functions (similar in Renn 1995):
(a) providing factual insights that help policy-makers to identify and frame problems and to understand the situation (enlightenment function);
(b) providing instrumental knowledge that allows policy-makers to assess and evaluate the likely consequences of each policy option (pragmatic or instrumental function);
(c) providing arguments, associations, and contextual knowledge that helps policy-makers to reflect on their situation and to improve and sharpen their judgment (reflexive function);
(d) providing procedural knowledge that helps policy-makers to design and implement procedures for conflict resolution and rational decision making (catalytic function); and
(e) providing guidelines or designing policy options that assist decision-makers in their effort to communicate with the various target audiences (communicative function).
These five functions touch on crucial aspects of policy-makers’ needs. First, insights offered by experts help policy-makers to understand the issues and constraints of different policy options when designing and articulating policies. Policy-makers need background information to develop standards, to ground economic or environmental policies on factual knowledge, and to provide information about the success or
failure of policies. Second, scientific methods and their applications are needed to construct instrumental knowledge in the format of ‘if–then’ statements and empirically tested theories; this knowledge leads to the articulation of means-ends oriented policies and problem solving activities. Third, scientific reasoning and understanding help policy-makers to reflect on their activities and to acknowledge social, cultural, institutional, and psychological constraints as well as opportunities that are not easily grasped by common sense or instrumental reasoning. However, scientific statements may also restrict policy-makers as they are directed towards adopting a single perspective in analyzing and framing a problem. Fourth, policy-makers may use scientists to design procedures of policy formulation and decision making in accordance with normative rules of reasoning and fairness. These procedures should not interfere with the preferences of those who are involved in the decision-making process, but provide tools for making these preferences the guiding principle of policy selection. To meet this function, scientists need to play a role similar to a chemical catalyst by speeding up (or if necessary slowing down) a process of building consensus among those who are entitled to participate in the policy-making process (Fishkin 1991). Lastly, scientific experts can help to design appropriate communication programs for the purpose of legitimizing public policies as well as preparing target audiences for their specific function or role in the task of risk management. This article focuses predominantly on the influence of scientific and technical expertise, in particular the results of technology assessments, on public policy-making. The second section deals with the risks and challenges of technical experts providing input to policy design and implementation. The third section addresses the relevance of systematic knowledge for policy-making. The fourth section provides some theoretical background for the role of expertise in deliberative processes. The fifth section focuses on cultural differences in the use of expertise for policy-making. The last section summarizes the main points of this article.
2. Using Scientific Expertise for Policy-making: Risks and Challenges

The interaction between experts and policy-makers is a major issue in technology management today and is likely to become even more important in the future. This is due in the first instance to the increased interactions between human interventions and natural responses and, secondarily, to the increased complexity of the necessary knowledge for coping with economic, social, and environmental problems. Population growth and migration, global trade,
international market structures, transboundary pollution, and many other conditions of modern life have increased the sensitivity to external disturbances and diminished the capability of social and natural systems to tolerate even small interventions. Although contested by some (Simon 1992), most analysts agree that ecological systems have become more vulnerable as the impact of human intervention has reached and exceeded thresholds of self-repair (Vitousek et al. 1986). Given this critical situation, what are the potential contributions of expertise to the policy process? In principle, experts can provide knowledge that can help to meet the five functions mentioned above and to anticipate potential risks before they materialize. But they can do this only to the degree that the state of the art in the respective field of knowledge can provide reliable information pertaining to the policy options. Many policy-makers share assumptions about expertise that turn out to be wishful thinking or illusions (Funtowicz and Ravetz 1990, Jasanoff 1990, 1991, Rip 1992, Beck 1992). Most prominent among these are:
(a) illusion of certainty: making policy-makers more confident about knowing the future than is justified;
(b) illusion of transferability: making policy-makers overconfident that certainty in one aspect of the problem applies to all other aspects as well;
(c) illusion of ‘absolute’ truth: making policy-makers overconfident with respect to the truthfulness of evidence;
(d) illusion of ubiquitous applicability: making policy-makers overconfident in generalizing results from one context to another.
These illusions are often reinforced by the experts themselves. Many experts feel honored to be asked by powerful agents of society for advice. Acting under the expectation of providing unbiased, comprehensive, and unambiguous advice, they often fall prey to the temptation to oversell their expertise and provide recommendations far beyond their realm of knowledge. This overconfidence in one’s own expertise gains further momentum if policy-maker and advisor share similar values or political orientations. As a result, policy-makers and consultants are prone to cultivate these illusions and act upon them. In addition to these four types of illusions, experts and policy-makers tend to overemphasize the role of systematic knowledge in making decisions. As much as political instinct and common sense are poor guides for decision making without scientific expertise, the belief that scientific knowledge is sufficient to select the correct option is just as short-sighted. Most policy questions involve both systematic as well as anecdotal and idiosyncratic knowledge (Wynne 1989). Systematic knowledge often provides little insight into designing policies for concrete issues. For example, planning highways, supporting special industries, promoting health care for a community, and many other
issues demand local knowledge on the social context and the specific history of the issue within this context (Wynne 1992, Jasanoff 1991). Knowledge based on local perspectives can be provided only by those actors who share common experiences of the issue in question. The role of systematic versus particularistic knowledge is discussed in more detail in the next section.
3. The Relevance of Systematic Expertise for Policy-making

There is little debate in the literature that the inclusion of expertise is essential as a major resource for designing and legitimizing technological policies (Jasanoff 1990). A major debate has evolved, however, on the status of scientific and technical expertise for representing all or most of the knowledge that is relevant to these policies. This debate includes two related controversies: the first deals with the problem of objectivity and realism; the second one with the role of subjective and experiential knowledge that nonexperts have accumulated over time. This is not the place to review these two controversies in detail (see Bradbury 1989, Shrader-Frechette 1991). Depending on which side one stands on in this debate, scientific evidence is either regarded as one input to fact-finding among others or as the central or even only legitimate input for providing and resolving knowledge claims. There is agreement, however, among all camps in this debate that systematic knowledge is instrumental for understanding phenomena and resolving problems. Most analysts also agree that systematic knowledge should be generated and evaluated according to the established rules or conventions of the respective discipline (Jaeger 1998, p. 145). Methodological rigor aiming to accomplish a high degree of validity, reliability, and relevance remains the most important yardstick for judging the quality of scientific insights. Constructivist scholars in science and technology studies do not question the importance of methodological rules in securing credible knowledge but are skeptical whether the results of scientific inquiries represent objective or unambiguous descriptions of reality (Latour and Woolgar 1979, Knorr-Cetina 1981). Rather, they see scientific results as products of specific processes or routines that an elite group of knowledge producers has framed as ‘objective’ and ‘real.’ The ‘reality’ of these products is determined by the availability of research routines and instruments, prior knowledge and judgments, and social interests (see also Beck 1992, although he regards himself as a moderate realist). For the analysis of scientific input to policy-making, the divide between the constructivists and the realists matters only in the degree to which scientific input is used as a genuine knowledge base or as a final arbiter for reconciling knowledge conflicts. A knowledge
discourse deals with different, sometimes competing claims that obtain validity only through compatibility checks with acknowledged procedures of data collection and interpretation, proof of theoretical compatibility and conclusiveness, and the provision of intersubjective opportunities for reproduction (Shrader-Frechette 1991, pp. 46ff.). Obviously many research results do not reach the maturity of proven facts, but even intermediary products of knowledge, ranging from plain hypotheses, via plausible deductions, to empirically proven relationships, strive for further perfection (cf. the pedigree scheme of Funtowicz and Ravetz 1990). On the other hand, even the most ardent proponent of a realist perspective will admit that often only intermediary types of knowledge are available when it comes to assessing and evaluating risks. What does this mean for the status and function of scientific expertise in policy contexts? First, scientific input has become a major element of technological decision-making in all technologically developed countries. The degree to which the results of scientific inquiry are taken as ultimate evidence to judge the appropriateness and validity of competing knowledge claims is contested in the literature and also contested among policy-makers and different social groups. Frequently, the status of scientific evidence becomes one of the discussion points during social or political deliberation, depending on the context and the maturity of scientific knowledge in the technological policy arena under question. For example, if the issue is the effect of a specific toxic substance on human health, subjective experience may serve as a heuristic tool for further inquiry and may call attention to deficits in existing knowledge, although toxicological and epidemiological investigations are unlikely to be replaced with intuitions from the general public. If the issue, by contrast, is the siting of an incinerator, local knowledge about sensitive ecosystems or traffic flows may be more relevant than systematic knowledge about these impacts in general (a good example of the relevance of such particularistic knowledge can be found in Wynne 1989). Second, the resolution of competing claims of scientific knowledge is usually governed by the established rules within the relevant disciplines. These rules may not be perfect and may even be contested within the community. Yet they are regarded as superior to any other alternative (Shrader-Frechette 1991, pp. 190ff.). Third, many technological decision options require systematic knowledge that is either not available or still in its infancy or in an intermediary status. Analytic procedures are then demanded by policy-makers as a means to assess the relative validity of each of the intermediary knowledge claims, to display their underlying assumptions and problems, and to demarcate the limits of ‘reasonable’ claims, that is, to identify the range of those claims that are still compatible with the state of the art in this knowledge domain (Shrader-Frechette 1991). Fourth, knowledge
claims can be systematic and scientific as well as idiosyncratic and anecdotal. Both forms of knowledge have a legitimate place in technological decisionmaking. How they are used depends on the context and the type of knowledge required for the issue in question (Wynne 1992). All four points show the importance of knowledge for technology policy and decision making, but also make clear that choosing the right management options requires more than looking at the scientific evidence alone.
4. Scientific Evidence in Deliberative Processes

Given the delicate balance between anecdotal, experiential, and systematic knowledge claims, policy-making depends on deliberative processes in which competing claims are generated or ignored, sorted, selected, and highlighted. The first objective is to define the relevance of different knowledge claims for making legitimate and defendable choices. The second objective is to cope with issues of uncertainty and to assign trade-offs between those who will benefit and those who will suffer from different policy options. The third objective is to take into account the wider concerns of the affected groups and the public at large. These three elements of deliberation are related to coping with the problems of complexity, uncertainty, and ambiguity. How can deliberative processes deal with these three problems? To respond to this question, it is necessary to introduce the different theoretical concepts underlying deliberative processes. The potential of deliberation has been discussed primarily in three schools of thought:
(a) The utility-based theory of rational action (basics in Fisher and Ury 1981, Raiffa 1994; review of pros and cons in Friedman 1995): in this concept, deliberation is framed as a process of finding one or more options that optimize the payoffs to each participating stakeholder. The objective is to convert positions into statements of underlying interests. If all participants articulate their interests, it is possible either to find a new win-win option that is in the interest of all, or at least does not violate anybody’s interest (Pareto-optimal solution), or to find a compensation that the winners pay to the losers such that both sides are at least indifferent between the status quo (option forgone, no compensation paid) and the implementation of the option plus compensation (Kaldor–Hicks solution; both criteria are stated formally after this list). In this context systematic knowledge is required to inform all participants of the likely consequences of each decision option. The evaluation of the desirability of each option is a matter of individual preferences and, in this perspective, lies outside of the deliberation process.
(b) Theory of communicative action (Habermas 1987, Webler 1995): this concept focuses on the communicative process of generating preferences,
values, and normative standards. Normative standards are those prescriptions that apply not only to the participants of the discourse but also to society as a whole, or at least to a large segment of the external population. Normative standards in technological arenas include, for example, exposure limits or performance standards for technologies. They apply to all potential emitters or users regardless of whether they were represented at the discourse table or not. The objective here is to find consensus among moral agents (not just utility maximizers) about the shared meaning of actions, based on knowledge about consequences and an agreement on basic human values and moral standards. Systematic knowledge in this context provides the participants with insights into the potential effects of collective decision options and helps them to reorganize their preferences according to mutually desirable outcomes.

(c) Theory of social systems (Luhmann 1986, Eder 1992): the (neo)functional school of sociology pursues a different approach to deliberation. It is based on the assumption that each stakeholder group has a separate reservoir of knowledge claims, values, and interpretative frames, and that each group-specific reservoir is incompatible with the reservoirs of the other groups. This implies that deliberative actions do not resolve anything; they represent autistic self-expressions of stakeholders. In its cynical, deconstructivist version, deliberation serves as an empty but important ritual that gives all actors the illusion of taking part in the decision process. In its constructive version, deliberation leads to the enlightenment of decision-makers and participants. Far from resolving or even reconciling conflicts, deliberation in this viewpoint has the potential to decrease the pressure of conflict, provide a platform for making and challenging claims, and assist policy-makers in becoming cognizant of different interpretative frames (Luhmann 1993). Deliberations help to reframe the decision context, to make policy-makers aware of public demands, and to enhance the legitimacy of collective decisions through reliance on formal procedures (Skillington 1997). In this understanding of deliberation, reaching a consensual conclusion is neither necessary nor desirable. The process of talking to each other, exchanging arguments, and widening one's horizon is all that deliberation can accomplish: an experience of mutual learning without a substantive message. Systematic knowledge in this context is never free of context and prescriptive assumptions. Hence, each group will make knowledge claims according to its interests and strategic goals. Integration of knowledge is based on rhetoric, persuasion skills, and power rather than on established rules of 'discovering the truth.'
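As a concrete illustration of the two decision criteria invoked by the rational-actor school in item (a), the following minimal sketch, in Python, tests hypothetical policy options against a status quo. All stakeholder names and payoff figures are invented for illustration, and the sketch assumes that payoffs can be expressed on a single commensurable scale, itself a contested premise in the deliberation literature.

# Minimal illustration of the two evaluation criteria used in the
# rational-actor school: Pareto improvement and Kaldor-Hicks compensation.
# All option names, stakeholders, and payoff figures are hypothetical.

STATUS_QUO = {"industry": 0.0, "residents": 0.0, "municipality": 0.0}

OPTIONS = {
    "option_a": {"industry": 5.0, "residents": 1.0, "municipality": 2.0},
    "option_b": {"industry": 8.0, "residents": -3.0, "municipality": 4.0},
    "option_c": {"industry": 1.0, "residents": -6.0, "municipality": 2.0},
}

def is_pareto_improvement(payoffs, baseline):
    """True if no stakeholder is worse off and at least one is better off."""
    return (all(payoffs[s] >= baseline[s] for s in baseline)
            and any(payoffs[s] > baseline[s] for s in baseline))

def kaldor_hicks_transfers(payoffs, baseline):
    """Return the compensation owed to each loser if aggregate gains can
    cover aggregate losses (the Kaldor-Hicks test); otherwise None."""
    losses = {s: baseline[s] - payoffs[s]
              for s in baseline if payoffs[s] < baseline[s]}
    gains = sum(payoffs[s] - baseline[s]
                for s in baseline if payoffs[s] > baseline[s])
    return losses if gains >= sum(losses.values()) else None

for name, payoffs in OPTIONS.items():
    if is_pareto_improvement(payoffs, STATUS_QUO):
        print(f"{name}: Pareto improvement, no compensation needed")
    elif (transfers := kaldor_hicks_transfers(payoffs, STATUS_QUO)) is not None:
        print(f"{name}: passes the Kaldor-Hicks test; owed transfers {transfers}")
    else:
        print(f"{name}: fails both criteria")

In this framing, deliberation would concern the payoff estimates and the acceptability of the compensation, not the criteria themselves, which are fixed in advance.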
These three understandings of deliberation are not mutually exclusive, although many proponents of each school argue otherwise. Based on the previous arguments, it is quite obvious that the rational-actor approach provides a theoretical framework for understanding how actors in deliberative processes deal with complexity and, in part, with uncertainty. The communicative action approach provides a theoretical structure for understanding and organizing discourses on ambiguities and moral positions. In particular, this concept can highlight those elements of deliberation that help participants to deal competently with moral and normative issues beyond personal interests. The system-analytic school introduces some skepticism towards the claims of the other schools with respect to the outcomes of deliberation. Instead it emphasizes the importance of procedures, routines, and learning experiences for creating links or networks between the major systems of society. Deliberation is the lubricant that helps each of the collective social actors to move mostly independently in society without bumping into the domains of the other actors.

Deliberative processes aimed at integrating experts, stakeholders, policy-makers, and the public at large can be organized in many different forms. Practical experience has been gained with advisory committees, citizen panels, public forums, consensus conferences, formal hearings, and others (see Rowe and Frewer 2000). A hybrid model of citizen participation (Renn et al. 1993) has been applied to studies on energy policies and waste-disposal issues in West Germany, to waste-disposal facilities in Switzerland, and to sludge-disposal strategies in the United States (Renn 1999).
5. Cultural Styles in Using Scientific Expertise

The way that knowledge and expertise are included in policy processes depends on many factors. Comparative research on the influence of systematic knowledge in policy processes emphasizes the importance of cultural context and historical developments (Solingen 1993). In addition, state structures and institutional arrangements significantly influence the way expertise is incorporated into decision-making processes. There has been a major shift in modern states towards an organized and institutionalized exchange between science organizations and policy-making bodies (Mukerji 1989, Jasanoff 1990). Although science has become a universal enterprise, the specific meaning of what science can offer to policy-makers differs among cultures and nations (Solingen 1993). The situation is even more diverse when one investigates the use of science in different countries for advising policy-makers. Scientific and political organizations partially determine which aspects of life are framed as questions of knowledge and which as questions of 'subjective' values. In addition, national culture, political traditions, and social norms influence the mechanisms and institutions for integrating expertise in the policy arenas (Wynne 1992). In one line of work, policy scholars have developed a classification of governmental styles that highlights
four different approaches to integrating expert knowledge into public decisions (Brickman et al. 1985, Jasanoff 1986, O'Riordan and Wynne 1987, Renn 1995). These styles have been labeled inconsistently in the literature, but they refer to common procedures in different nations. The 'adversarial' approach is characterized by an open forum in which different actors compete for social and political influence in the respective policy arena. The actors in such an arena need and use scientific evidence to support their positions. Policy-makers pay specific attention to formal proofs of evidence because policy decisions can be challenged on the basis of insufficient use or neglect of scientific knowledge. Scientific advisory boards play an important role, as they help policy-makers to evaluate competing claims of evidence and to justify the final policy selection (Jasanoff 1990). A sharp contrast to the adversarial approach is provided by the 'fiduciary' style (Renn 1995). Here the decision-making process is confined to a group of patrons who are obliged to make the 'common good' the guiding principle of their action. Public scrutiny or involvement is alien to this approach. The public can provide input to and arguments for the patrons but is not allowed to be part of the negotiation or policy-formulation process. Scientists outside the policy-making circles are used as consultants at the discretion of the patrons and are selected according to prestige or personal affiliations. Their role is to provide enlightenment and information, while the patrons' staff generate instrumental knowledge. This system relies on producing faith in the competence and fairness of the patrons involved in the decision-making process. Two additional styles are similar in their structure but not identical. The 'consensual' approach is based on a closed circle of influential actors who negotiate behind closed doors. Representatives of important social organizations or groups and scientists work together to reach a predefined goal. Controversy is not visible, and conflicts are often reconciled before formal negotiations take place. The goal of the negotiation is to combine the best available evidence with the various social interests that the different actors represent. The 'corporatist' style is similar to the consensual approach, but is far more formalized. Well-known experts are invited to join a group of carefully selected policy-makers representing major forces in society (such as employers, unions, churches, professional associations, and environmentalists). Invited experts are asked to offer their professional judgment, but they often do not need to present formal evidence for their claims. This approach is based on trust in the expertise of scientists. These four styles are helpful in characterizing and analyzing different national approaches to policy-making. The American system is oriented toward the adversarial style and the Japanese system toward the consensual. The policy style of northern Europe comes closest to the corporatist approach, whereas most
southern European countries display a fiduciary approach. All these systems, however, are in transition. Interestingly, the United States has tried to incorporate more consensual policies into its adversarial system, while Japan is faced with increasing demands for more public involvement in the policy process. These movements towards hybrid systems have contributed to the genesis of a new regulatory style, which may be called 'mediative.' There has been a trend in all technologically developed societies to experiment with opening expert deliberations to more varied forms of stakeholder or public participation. In the United States, this has taken the form of negotiated or mediated rule making; in Europe, it has evolved as an opening of corporatist clubs to new groups such as the environmental movement. It is too early to say whether this new style will lead to more convergence among countries or to a new set of cultural differentiations (Renn 1995).
6. Conclusions

The economic and political structures of modern societies underwent rapid transitions in the late twentieth century. This transition was accompanied by globalization of information, trade, and cultural lifestyles, an increased pluralism of positions, values, and claims, the erosion of trust and confidence in governing bodies, an increased public pressure for participation, and growing polarization between fundamentalist groups and agents of progressive change. The resulting conflicts put pressure on political systems to integrate different outlooks and visions of the future and to provide justifications of governmental decisions on the basis of both facts and values. In this situation, policy-making institutions discovered an urgent need for policy advice, as well as new modes of integrating expertise with values and preferences. Research on advisory processes indicates that the following points will need to be addressed. (a) Using scientific expertise in the policy arena is one element in the quest of modern societies to replace or amend the collective learning process of trial and error by more humane methods of anticipation, in which the possibility of errors is reduced. This process is socially desired, though it cannot reduce the uncertainties of change to zero. Anticipation both necessitates and places new demands on expertise in the service of policy-making. (b) Scientific expertise can serve five functions: enlightenment; pragmatic or instrumental; reflexive; catalytic; and communicative. All five are in demand by policy-makers, but expertise may also distort their perspective on the issue or prescribe a specific framing of the problem. That is why many policy analysts demand that scientific input be controlled by democratic institutions and be open to public scrutiny. (c) Scientific expertise is also used for legitimizing
decisions and justifying policies that may face resistance or opposition. Expertise can, therefore, conflict with public preferences or interests. In addition, policy-makers and experts pursue different goals and priorities. Expertise should be regarded as one crucial element of policy-making among others. Scientific advice is often mandated by law, but its potential contributions may vary from one policy arena to another. In particular, scientific expertise cannot replace public input in the form of locally relevant knowledge, historical insights, and social values. (d) The influence of expertise depends on the cultural meaning of expertise in different social and political arenas. If systematic expertise is regarded as the outcome of a socially constructed knowledge system, its authority can be trumped by processes that are seen to be more democratic. If, however, systematic expertise is seen as an approximation of reality or truth, it gains a privileged status among different sources of knowledge and inputs, even if it is the product of a democratically imperfect process. The degree to which these different understandings of expertise are accepted or acknowledged within a political arena will affect the practical influence and power of experts in collective decision-making. (e) Scientific expertise is absorbed and utilized by the various policy systems in different styles. One can distinguish four styles: adversarial, fiduciary, consensual, and corporatist. A new mediative style seems to be evolving from the transitions toward more open procedures in decision-making. This style seems to be specifically adjusted to postmodern societies. Scientific expertise in this style pursues a 'system and problem oriented' approach to policy-making, in which science, politics, and economics are linked by strategic networks. (f) Organizing and structuring discourses on the selection of policy options is essential for the democratic, fair, and competent management of public affairs. The mere desire to initiate a two-way communication process and the willingness to listen to public concerns are not sufficient. Deliberative processes are based on a structure that assures the integration of technical expertise, regulatory requirements, and public values. Co-operative discourse is one model among others that has been designed to meet that challenge. No one questions the need to initiate a common discourse or dialogue among experts, policy-makers, stakeholders, and representatives of affected publics. This is particularly necessary if highly controversial subjects are at stake. The main challenge of deliberative processes will continue to be how to integrate scientific expertise, rational decision-making, and public values in a coherent form.

See also: Expert Systems in Cognitive Science; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Expertise, Acquisition of;
Medical Expertise, Cognitive Psychology of; Policy History: Origins; Policy Knowledge: Universities; Policy Networks
Bibliography

Beck U 1992 Risk Society: Towards a New Modernity (trans. Ritter M A). Sage, London
Bradbury J A 1989 The policy implications of differing concepts of risk. Science, Technology, and Human Values 14(4): 380–99
Brickman R, Jasanoff S, Ilgen T 1985 Controlling Chemicals: The Politics of Regulation in Europe and the United States. Cornell University Press, Ithaca, NY
Cadiou J-M 2001 The changing relationship between science, technology and governance. The IPTS Report 52: 27–9
Campbell N 1951 (original 1921) What is Science? Dover, New York
Eder K 1992 Politics and culture: On the sociocultural analysis of political participation. In: Honneth A, McCarthy T, Offe C, Wellmer A (eds.) Cultural-Political Interventions in the Unfinished Project of Enlightenment. MIT Press, Cambridge, MA, pp. 95–120
Fisher R, Ury W 1981 Getting to Yes: Negotiating Agreement without Giving In. Penguin Books, New York
Fishkin J 1991 Democracy and Deliberation: New Directions for Democratic Reform. Yale University Press, New Haven, CT
Friedman J (ed.) 1995 The Rational Choice Controversy. Yale University Press, New Haven, CT
Funtowicz S O, Ravetz J R 1990 Uncertainty and Quality in Science for Policy. Kluwer, Dordrecht and Boston
Habermas J 1987 Theory of Communicative Action. Vol. II: Reason and the Rationalization of Society. Beacon Press, Boston
Jaeger C 1998 Current thinking on using scientific findings in environmental policy making. Environmental Modeling and Assessment 3: 143–53
Jasanoff S 1986 Risk Management and Political Culture. Russell Sage Foundation, New York
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Knorr-Cetina K D 1981 The Manufacture of Knowledge: An Essay on the Constructivist and Contextual Nature of Science. Pergamon Press, Oxford, UK
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, Beverly Hills and London
Lindblom C E, Cohen D K 1979 Usable Knowledge: Social Science and Social Problem Solving. Yale University Press, New Haven, CT
Luhmann N 1986 The autopoiesis of social systems. In: Geyer R F, van der Zouwen J (eds.) Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-steering Systems. Sage, London, pp. 172–92
Luhmann N 1993 Risk: A Sociological Theory. Aldine de Gruyter, New York
Mukerji C 1989 A Fragile Power: Scientists and the State. Princeton University Press, Princeton, NJ
O'Riordan T, Wynne B 1987 Regulating environmental risks: A comparative perspective. In: Kleindorfer P R, Kunreuther H C (eds.) Insuring and Managing Hazardous Risks: From Seveso to Bhopal and Beyond. Springer, Berlin, pp. 389–410
Raiffa H 1994 The Art and Science of Negotiation, 12th edn. Cambridge University Press, Cambridge, UK
Renn O 1995 Styles of using scientific expertise: A comparative framework. Science and Public Policy 22: 147–56
Renn O 1999 A model for an analytic deliberative process in risk management. Environmental Science and Technology 33(18): 3049–55
Renn O, Webler T, Rakel H, Dienel P C, Johnson B B 1993 Public participation in decision making: A three-step procedure. Policy Sciences 26: 189–214
Rip A 1992 The development of restrictedness in the sciences. In: Elias N, Martins H, Whitely R (eds.) Scientific Establishments and Hierarchies. Kluwer, Dordrecht and Boston, pp. 219–38
Rowe G, Frewer L J 2000 Public participation methods: A framework for evaluation. Science, Technology & Human Values 25(1): 3–29
Shrader-Frechette K 1991 Risk and Rationality: Philosophical Foundations for Populist Reforms. University of California Press, Berkeley, CA
Simon J L 1992 There is no environmental, population, or resource crisis. In: Tyler-Miller G (ed.) Living in the Environment. Wadsworth, Belmont, pp. 29–30
Skillington T 1997 Politics and the struggle to define: A discourse analysis of the framing strategies of competing actors in a 'new' participatory forum. British Journal of Sociology 48(3): 493–513
Solingen E 1993 Between models and the State: Scientists in comparative perspective. Comparative Politics 26(1): 19–27
Vitousek P M, Ehrlich A H, Matson P H 1986 Human appropriation of the products of photosynthesis. BioScience 34: 368–73
Webler T 1995 'Right' discourse in citizen participation: An evaluative yardstick. In: Renn O, Webler T, Wiedemann P (eds.) Fairness and Competence in Citizen Participation. Kluwer, Dordrecht and Boston, pp. 35–86
Wynne B 1989 Sheepfarming after Chernobyl. Environment 31: 11–15, 33–9
Wynne B 1992 Uncertainty and environmental learning: Reconceiving science and policy in the preventive paradigm. Global Environmental Change 2: 111–27
O. Renn
Science and the Media

The phrase 'science and the media' as a single concept encompasses two major social institutions. Science is both a system of reliable knowledge about the natural world and a complex social system for developing and maintaining that knowledge. Similarly, the media comprise a variegated system for collecting and presenting information as well as an institution with significant economic, political, and social impact. Though long perceived as separate realms, recent scholarship has highlighted their mutual dependence. Increasing episodes of tension and conflict in the late twentieth century led scholars, scientists, media producers, and social critics to explore the interactions of science and the media. Working scientists and media producers sought guidance on how to use each other's resources more effectively, while analysts asked how such issues as political and economic interests, rhetorical
conventions, and audience responses shed light on the roles of science and media in shaping each other. One effect of recent scholarship has been to elide the difference between 'media' (all forms of media) and 'the media' (particularly the major institutions of newspapers, magazines, television, and radio), an elision that will be evident throughout this article.
1. Science and the Media: A Brief History

The history of science and the media is the history of a growing mesh: increasing use of media by science, increasing attention to scientific ideas by media institutions, and increasing tensions caused by the rising interaction. Through the middle of the twentieth century, science and media were largely seen as separate realms. Science had emerged in the nineteenth century from natural philosophy as a systematic approach to understanding the natural world. With the advent of research-based industries such as chemical dyes and electronic communication, science began to attract the attention of capitalists and government leaders who understood the value of controlling scientific knowledge. By the end of World War II, the emergence of 'big science' had made science a powerful institutional force as well as a system of accredited knowledge with increasingly far-reaching applications. The media had also evolved in the nineteenth century from small, local organizations into nationally and internationally circulated publications that served as tools of merchant capitalism and political control, full of advertising, business and political news, and the ideology of the industrial revolution. By the early twentieth century, new electronic media such as movies and radio provided the means for newly developed forms of 'publicity' and propaganda to help produce modern mass society, as well as to increase public access to new forms of cultural entertainment. Since the late seventeenth century, natural philosophers had relied on print media, particularly professional journals such as the Philosophical Transactions and books such as Isaac Newton's Principia, as tools for disseminating the results of their investigations. These publications were not only records of experiments or philosophical investigations, but also served as active carriers of the rhetorical structures that enabled natural philosophers to convince people outside their immediate vicinity of the truth of their claims (Shapin and Schaffer 1985). Early in the nineteenth century, new forms of science media emerged, particularly a genre known variously as 'popularization,' vulgarisation [French], or divulgación [Spanish]. These books and magazines provided entrée into the knowledge of the natural world for non-scientist readers. They ranged from textbooks for young women through to new mass-circulation magazines filled with instruction on the latest achievements
of a rapidly industrializing society. The so-called 'great men' of late nineteenth century England (most notably Thomas Huxley and John Tyndall) became evangelists for science, lecturing widely to enlist public support for the enterprise of rational understanding of the natural world; they converted their lectures into articles and books to spread their reach even further. At the same time, scientists' use of media for disseminating knowledge among the scientific community grew dramatically, with the growth of new journals and the founding of publishing houses committed to scientific information. The growth of both science and media in the twentieth century forced specialization upon both fields, leading to a distinction between what some scientists have called 'communication in science' and 'communication about science.' For scientists in their daily work, broad synthetic journals such as Science and Nature (published respectively in the USA and the UK) were joined by topical journals in chemistry, physics, biology, and their many sub-disciplines. For non-scientists, magazines and other media (such as radio and new 'wire services' for newspapers) increasingly focused on particular audiences, such as teachers, inventors, 'high culture' readers, general media users, and so on. Although scientific ideas appeared in some entertainment media, science in most non-technical media became a component of news coverage for different audiences. In the 1920s and 1930s, a small group of professional journalists began to write almost exclusively about science. (These developments were clearest in the highly developed countries of Europe and North America. The patterns in the rest of the world have been little studied and so cannot be described.) By the second half of the twentieth century, the interaction of science and media had become a complex web, in which new scientific research appeared in abstracts, journals, press reports, and online, while other forms of science (not necessarily derivative of the research reports) appeared regularly in media outlets such as newspapers, television, websites, radio programs, puppet shows, traveling circuses, and museums. Moreover, as science became central to modern culture, the presence of science in media such as poetry, sculpture, fine art, entertainment films, and so on, became more evident. In the journalistic media, many reports circulated internationally, through cooperative agreements between public broadcasting systems and worldwide news services. The rise of the World Wide Web in the 1990s provided further opportunities for information about science to circulate more fully between the developed and developing world. However, from about mid-century onward, no matter what media were involved, many working scientists perceived presentations in non-technical media to be incorrect, oversimplified, or sensationalized. Thus, tensions developed between the institutions
of science and those of the media. Those tensions led eventually to the development of an analytic field of science and media, populated largely by media sociologists, communication content scholars, and sociologists of science.
2. Understanding Science and Media

Analysis of science and media has led to a deeper understanding of the essential role of media in creating scientific knowledge, the ways that media presentations shape understandings of nature, and the role of public debate in constituting scientific issues. The earliest attempts to understand the relationship of science and media came from two directions: the overwhelming growth of scientific information, leading to attempts by scientists and scientific institutions to learn how to manage scientific publications, and emerging tensions between scientists and journalists, as specialized reporting developed in the 1930s. Research and publications before and after World War II led by the 1970s to two common understandings. First, sociologists of science had come to understand the fundamental role of communication in the production of reliable knowledge about the natural world. Scientists do not work in isolation, but must constantly present their ideas to colleagues for acknowledgement, testing, modification, and approbation. The process of communication, occurring in various media, is the 'essence of science' (Garvey 1979), leading to science becoming 'public knowledge' (Ziman 1968). At the same time, the apparent differences between the goals of science (methodical, tentative, but nonetheless relatively certain statements about the natural world) and the needs of public media (rapid, attention-getting narratives about issues directly related to readers and viewers, regardless of certainty) had also been explicated (Krieghbaum 1967), leading to practical attempts to 'bridge the gap' between science and the media but also to recurring fears that the gap was unbridgeable. Beginning in the 1970s, new developments in the sociology of scientific knowledge opened up new ways of conceiving of the relationship between science and media (Barnes 1974, Latour and Woolgar 1979). In particular, attention to the rhetorical goals of scientists in their use of media suggested that a distinction between communication in science and communication about science could not be maintained. Instead, some researchers began to discuss the 'expository' nature of science, showing how scientists tailored their communication to meet the needs of specific media and specific communication contexts (Bazerman 1988, Shinn and Whitley 1985). These researchers highlighted the interaction of the idealized intellectual goals of science (for the production of reliable knowledge about the natural world) with the social and institutional goals of scientists and their employers or
patrons (for priority, status in the community, ownership of ideas and thus patents, and so on). In this conception of science and media, scientific knowledge itself might differ in different presentations: in some cases, appearing as a 'narrative of science' (focusing on the methodological and theoretical structures leading to particular knowledge of the natural world), in other cases as a 'narrative of nature' (focusing on the relationships between organisms or entities, with emphasis on the 'story' linking different aspects of those organisms and entities) (Myers 1990). The very boundary between 'science' and 'non-science' (or 'mere' popularization), between communication in science and communication about science, was shown to be highly mutable, itself an object of rhetorical construction used for political purposes by participants in scientific controversies to establish control over elements of a debate (Hilgartner 1990). A second new line of inquiry highlighted the institutional interdependence of science and the media. Pointing to the growing post-World War II need for scientific institutions to generate public and political support, and to the media's need to draw on scientific developments for constant infusions of drama and 'newness,' the new research identified 'selling science' as a major element of the political economy linking science and the media (Nelkin 1987). Finally, work on the images of science that appeared over time helped show the ways that social concerns interacted with scientific developments to create cultural representations of science (Weart 1988, LaFollette 1990, Nelkin and Lindee 1995). At the same time that these developments were occurring in the understanding of science communication, new approaches were developing in understanding the relationship between science and the public. Spurred by concerns in the scientific community about lack of 'scientific literacy,' the studies ultimately questioned the idea that the public has a 'deficit' of knowledge that needs to be 'improved' (Irwin and Wynne 1996). Instead, the new approach focused on the possibility of engaging the public in discussions about scientific issues, recognizing the contingent nature of understanding of scientific issues and thus the alternative meanings that might emerge in democratic discussions instead of authoritarian pronouncements (Sclove 1995). Much of the new work looked at issues of uncertainty and risk, highlighting the social negotiations required to make personal and policy decisions in a context of incomplete information (Friedman et al. 1999). The new work clearly implied that science and the media cannot be perceived as separate institutions with separate goals. Instead, better understanding comes from analyzing science and media across audiences, looking at the political and economic interests of media producers, at the rhetorical meanings of media conventions, and at the audience responses to media content (especially in terms of meaning-making).
3. Approaches to Studying Science and Media

3.1 Political and Economic Interests of Media Producers

The most detailed and well-developed understandings of science and the media point to the political and economic interests that motivate the producers of media representations of science. In many countries, the media depend on government subsidies or at least on political tolerance. Moreover, media are expensive to produce. While the presence of the media in democratic societies is often seen as fundamental to political liberty, the economic need to appeal to broad audiences (even for government-controlled media) often leads to editorial decisions that emphasize broad coverage and sensationalism over depth and sobriety. Science media are no different, and even the most prestigious scientific journals such as Science and Nature have been criticized for highlighting the most 'newsworthy' research, for hyping scientific reports beyond their value, for engaging in 'selling science.' Newspaper and television coverage of science is routinely criticized for its sensationalism, particularly in the coverage of controversial issues such as food contamination (e.g., the BSE or 'mad cow' epidemic in Britain in the 1980s and 1990s), genetic modification of food, and global climate change. Journalists (and their employers) routinely defend their practices on the basis of 'what readers/viewers want,' which is often judged by sales success. Entertainment media are criticized even more heavily for creating movie plots that involve impossible science or focus on highly unlikely scenarios; producers disclaim any responsibility for accuracy, claiming only to be providing (profitable) escapism. These recurring disputes between the scientific community and media producers highlight the irreconcilable tension between the goals of science and the goals of the media.
3.2 Rhetorical Meanings of Media Conventions

An emerging area of understanding is the way that media conventions shape the rhetorical meaning of science communication. Scientific papers reporting original experimental research, for example, have developed a standardized format (called 'IMRAD,' for introduction, methods, results, analysis, discussion). The IMRAD format, conventionally written in an impersonal tone, with passive grammatical constructions, hides the presence of the researcher in the research, highlighting the objective character of scientific knowledge that is claimed to exist independently of its observer. In journalistic reports of scientific controversies, the conventions of objective reporting require 'balance': the presentation of both sides of a controversy (Bazerman 1988). Though 99.9 percent of all researchers may hold that a new claim
(such as the eventually discredited 1989 announcement of a new form of 'cold nuclear fusion') is scientifically untenable, the journalistic norm of balance may lead to approximately equal attention to all claims. Similarly, journalistic goals of storytelling often lead to emphasis on the human, narrative dimensions of an issue over the theoretical or mathematical components of the science. New research is exploring the meaning of visual representations of science, at every level from the technical outputs of genome analyzers to mass media manipulations of false-color images produced by orbiting astronomical telescopes (Lynch and Woolgar 1990).
3.3 Audience Responses to Media Content

One of the most difficult areas to investigate is the audience response to particular media presentations. Virtually no work, for example, has been done on how scientists respond to specialized science media, despite anecdotal stories suggesting that changes in information presentation (such as World Wide Web access to scientific databases) may have dramatic impacts on how scientists conceive of the natural world and frame their hypotheses and theories. Some work has been done at broader public levels, such as responses to media presentations of risk information. But, although the psychology of risk perceptions, for example, has been extensively investigated, the factors affecting individual responses to media coverage of specific risky incidents are so varied as to defy measurement. Though the complexity leads some researchers to reject the possibility of quantitative analysis, others insist that social scientists can learn to isolate appropriate variables. Certainly some areas of audience response could be investigated more carefully; audience reaction to images of science in entertainment films, for example, has never been systematically tested, despite frequent statements that 'the image of scientists in the movies is bad.' The shift away from separate analysis of 'science' and 'media' to an integrated understanding of 'science and media' has led in recent years to the possibility of cross-cutting analyses, such as those of producers, rhetorical structures, and audience reception. These analyses will ultimately yield a much richer understanding of the interaction of science and media. Less clear is how such improved understanding will contribute to the practical concerns of those in the scientific community and elsewhere who worry about possible linkages among science literacy, the image of science, and public support for science.

See also: Educational Media; Genetics and the Media; Media and Child Development; Media and History: Cultural Concerns; Media Ethics; Media Events; Media, Uses of; Public Broadcasting; Public Relations in Media; Research Publication: Ethical Aspects
Bibliography

Barnes B 1974 Scientific Knowledge and Sociological Theory. Routledge and Kegan Paul, London
Bazerman C 1988 Shaping Written Knowledge: The Genre and Activity of the Experimental Article in Science. University of Wisconsin Press, Madison, WI
Friedman S M, Dunwoody S, Rogers C L (eds.) 1999 Communicating Uncertainty: Media Coverage of New and Controversial Science. Erlbaum Associates, Mahwah, NJ
Garvey W D 1979 Communication: The Essence of Science. Facilitating Information Exchange Among Librarians, Scientists, Engineers and Students. Pergamon Press, New York
Hilgartner S 1990 The dominant view of popularization: Conceptual problems, political uses. Social Studies of Science 20(3): 519–39
Irwin A, Wynne B (eds.) 1996 Misunderstanding Science? The Public Reconstruction of Science and Technology. Cambridge University Press, Cambridge, UK
Krieghbaum H 1967 Science and the Mass Media. New York University Press, New York
LaFollette M 1990 Making Science Our Own: Public Images of Science, 1910–1955. University of Chicago Press, Chicago
Latour B, Woolgar S 1979 Laboratory Life. Sage, Beverly Hills, CA
Lynch M, Woolgar S 1990 Representation in Scientific Practice. MIT Press, Cambridge, MA
Myers G 1990 Writing Biology: Texts in the Social Construction of Scientific Knowledge. University of Wisconsin Press, Madison, WI
Nelkin D 1987 Selling Science: How the Press Covers Science and Technology. Freeman, New York
Nelkin D, Lindee M S 1995 The DNA Mystique: The Gene as a Cultural Icon. Freeman, New York
Sclove R 1995 Democracy and Technology. Guilford, New York
Shapin S, Schaffer S 1985 Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Shinn T, Whitley R (eds.) 1985 Expository Science: Forms and Functions of Popularisation. D. Reidel, Dordrecht, The Netherlands, Vol. 9
Weart S 1988 Nuclear Fear: A History of Images. Harvard University Press, Cambridge, MA
Ziman J M 1968 Public Knowledge: An Essay Concerning the Social Dimension of Science. Cambridge University Press, Cambridge, UK
B. V. Lewenstein
Science and the State

1. Introduction

Since the sixteenth century, the 'scientific revolution' has generated a host of new intellectual, rhetorical, and institutional strategies for undermining traditional political authorities and constructing alternative ones. Science became a vital resource in the modern attempt to discredit traditional political hierarchies and spiritual transcendental sources of authority in
the context of public affairs. By providing new rationales for order compatible with the novel modern commitments to the values of individualism, voluntarism, and egalitarianism, science and technology paradoxically also provided new grounds for novel modern forms of hierarchy and authority committed to the use of knowledge in the reconstruction of society. The widely shared notion that science goes along with progress appeared to suggest that knowledge, when applied to public affairs, can in fact depoliticize public discourse and action. This faith produced wide mandates for large-scale social and political engineering and monumental state-sponsored technological projects in both authoritarian and democratic states. While such ideals and projects were from the very beginning criticized by observers such as Edmund Burke (1729–97), throughout the twentieth century the record of the relations between science and politics turned out to be sufficiently mixed to discard earlier hopes. The controversial role of scientists in the production and military deployment of means of mass destruction, the role of scientists in the Nazi experiments on human beings and the endorsement of racism, and their involvement in the industrial pollution of the environment, diminished the public trust in science fostered by such developments as medical breakthroughs and spectacular space flights. Since the closing decades of the twentieth century, an increasing number of social scientists and historians have been pointing out that new uncertainties in the sphere of politics have been converging with newly recognized uncertainties of science and risks of technology to form novel postmodern configurations of state, science, and society (Eisenstadt 1999, Beck 1992). It has become increasingly recognized that, in ethnically and religiously heterogeneous and multicultural societies, science and technology can no longer be attached to uncontroversial comprehensive values that privilege claims of neutrality, objectivity, and rationality; nor can the earlier belief that knowledge can depoliticize public discourse and action be sustained any longer.
2. The Authorities of Science and the State

The belief that science can provide a secular version of the synoptic divine view of human affairs was politically most significant. Until its erosion towards the end of the twentieth century, the modernist configuration of science and the state was formed over the course of at least four centuries. The secularization of political power in the modern state has put the dilemma of arbitrary rule at the center of modern political theory and practice. Whereas legal and constitutional constraints appeared to be the most appropriate remedy to this problem, the rise of modern
science and technology encouraged faith in the possibility of restraint and moderation through public enlightenment and by means of scientific and technical advice. Not surprisingly, the redemptive drive behind the integration of science and politics had its origins in premodern religious visions and values (Manuel 1974). In both its authoritarian and democratic versions, this modernist program rested on the belief that the modern state can remedy the defects of the present sociopolitical order and realize an ideal order. Science contributed to this modernist outlook the faith in the power of secular knowledge to control nature and reconstruct society. In his dedication to Lorenzo di Medici of Florence in The Prince (1513), Niccolo Machiavelli implies that the holder of the God's-eye view of the entire state is neither the King, who looks down at the people from the top of the hierarchy, nor the people, who see the King at the top from their lower state. It is a person like himself, a man of knowledge. The King, to be sure, has knowledge of the people, and the people have knowledge of the King but, because of his intellectual perspective, 'a man of humble and obscure condition' like Machiavelli can claim to see both the King and his subjects and understand the nature of their interactions. Political theory and political science thus claimed to have inherited the God's-eye view of the whole polity and evolved a secular vision of the state as an object of knowledge. In turn, the nation-state has often found it useful to adopt the frame, if not always the content, of the inclusive scientific outlook in order to legitimate its interventions in the name of the general common good and its expressions in public goals such as security, health, and economic welfare. If, from the inclusive perspective of the state, all citizens, social groups, or institutions appeared as parts which the supreme power had the capacity to fit together or harmonize, science in its various forms often provided the technical means to rationalize, and the rhetorical strategies to depoliticize, such applications of power. The potential of science not only as a source of knowledge but also as a politically useable authority was already recognized by Thomas Hobbes. This founder of modern political theory was in contact with Francis Bacon, admired William Harvey's findings about the circulation of the blood, and criticized Robert Boyle's position on the vacuum (Shapin and Schaffer 1985). In his influential works on the state, Hobbes tried to buttress his claims by appealing to the authority of scientific, especially mathematical, certainties, hoping to enlist the special force of a language that claims to generate proofs as distinct from mere opinions (Skinner 1996). Although he was suspicious of the experimental science of Boyle, Hobbes followed Machiavelli in describing politics in the language of causes rather than motives. Thomas Sprat, the historian of the Royal Society (1667), claimed that, when compared to the contentious languages of religions, the temperate discourse of science can advance consensus
across diverse social groups. In both the experimental and mathematical scientific traditions, however, knowledge, whether based on inferences or observations (or on their combination), was expected to end disputes. The association between science and consensus was regarded as deeply significant in a society which had begun to view authority and order as constructed from the bottom up rather than coming from above (Dumont 1986). Reinforced by the growing image of science as a cooperative, nonhierarchical, and international enterprise in which free individuals come to uncoerced agreements on the nature of the universe, scientific modes of reasoning and thinking appeared to suggest a model for evolving discipline and order among equals (Polanyi 1958, 1962). Deeply compatible with the notion that legitimate power and authority are constituted by social contracts, the rise of modern scientific knowledge appeared to accompany and reinforce the rise of the modern individual and his or her ability to challenge hierarchical forms of knowledge and politics. From Hobbes and Descartes, through Kant, J. S. Mill, and late-modern democratic thinkers such as Popper, Dewey, and Habermas, the commitment to the centrality of individual agency becomes inseparable from a growing faith in the possibility of expanding public enlightenment backed up by the demonstrated success of scientific modes of reasoning and acting.
3. Power and the Social Sciences

In succeeding centuries, one of the most persistent ambitions of Western society has been to make conclusions touching the 'things' of society, law, and politics appear as compelling and impervious to the fluctuations of mere opinion as conclusions touching the operation of heaven and earth. The faith that natural scientists are bound by the 'plain' language of numbers to speak with an authority which cannot be corrupted by fragile human judgment was gradually extended to the fields of engineering and the social sciences (Porter 1986). The notion that experts who are disciplined by respect for objective facts are a symbol of integrity and can therefore serve as guardians of public virtues against the villains of politics and business was widely interpreted to include, beyond natural scientists, other categories of experts who speak the language of numbers. Social science disciplines like economics, social statistics, sociology, political science, and psychology adopted modes of observing, inferring, and arguing which appeared to deploy in the sphere of social experience notions of objective facts and discernible laws similar to those that the natural sciences had developed in relation to physical nature. The social sciences discovered that the language of quantification is not only a powerful tool in the production of knowledge of society but also a
valuable political and bureaucratic resource for depersonalizing, and thus legitimizing, the exercise of power (Porter 1995). In the context of social and political life, the separation, facilitated by expert languages, between facts and values, causal chains and motives, appeared profoundly consequential. Until effectively challenged, mostly since the 1960s, such scientific and technical orientations to human affairs appeared to produce the belief that economic, social, political, and even moral 'facts' can be used to distinguish subjective or partisan from professional and apolitical arguments or actions. The apparent authority of science in the social and political context motivated the principal founders of modern ideologies like socialism, fascism, and liberalism to enlist science to their views of history, society, and the future. While theoretical and mathematized scientific knowledge remained esoteric, as religious knowledge was for centuries in the premodern state, the ethos of general enlightenment and the conception of scientific knowledge as essentially public made scientific claims appear agreeable to democratic publics even when these claims remained, in fact, elusive and removed from their understanding. In fields such as physics, chemistry, and medicine, machines, instruments, and drugs could often validate in the eyes of the lay public claims of knowledge which theories alone could not substantiate. The belief that science depoliticizes the grounds of state policies and actions, and subordinates them to objective and, at least in the eyes of some, transparent, professional standards, has been highly consequential for the uses of science in the modern state. Political leaders and public servants discovered that the appearance of transparency and objectivity can make even centralized political power seem publicly accountable (Price 1965, Ezrahi 1990). Moreover, the realization that the authority of science could in various contexts be detached from substantive scientific knowledge and deployed without the latter's constraint often enhanced the political usefulness of both natural and social scientists independently of the value actually accorded to their professional judgments. Still, the uses of expert scientific authority in the legitimization of state politics and programs have often empowered scientists to actually exercise influence on the substance of government actions or to effectively criticize them from institutional bases outside the government (Jasanoff 1990). The actual uses of scientific expertise could often be just the consequence of a political demand for expert authority.
4. Science in Democratic and Authoritarian States

In democratic societies, the integration of scientific authority, and sometimes also of scientific knowledge and technology, into the operations of the modern
state encouraged the development of a scientifically informed public criticism of the government. This process was greatly augmented by the rise of the modern mass media and the ability of scientists to call public attention to governmental failures traceable to nonuse or misuse of scientific expertise. The role of scientists in empowering public criticism of the uses of nuclear weapons and nuclear energy (Balogh 1991), in grounding public criticism of the operation of such agencies as the National Aeronautics and Space Administration (NASA) and the Food and Drug Administration (FDA), and in the criticism of public and private agencies for polluting the environment, illustrates the point. In authoritarian states like the Soviet Union, such cooperation between scientists, the mass media, and the public in criticizing government policies and actually holding the government accountable was, of course, usually repressed. In the authoritarian modernist version of science and the state, science was used mostly to justify, and partly to direct, comprehensive planning and control. Here, the force of scientific knowledge and authority rarely constrained the rationalization of centralization or warranted public skepticism towards the government or criticism of its actions (Scott 1998). Even in democratic states, however, science and technology were massively used to promote the goals of reconstruction, coordination, mass production, and uniformity. The redemptive role assumed by the political leadership in the name of equality, public welfare, and uniformity was no less effective in rationalizing state interventions than the totalistic utopianism enlisted to justify the direction and manipulation of society in authoritarian regimes. In all the variants of the modern state, bodies of expert knowledge such as statistics, demography, geography, macroeconomics, city planning, public medicine, mental health, applied physics, psychology, geology, and others were used to rationalize the creation of government organizations and semipublic regulatory bodies which had the mandate to shift at least part of the discretion held formerly by political, legal, and bureaucratic agents to experts (Price 1965, Wagner et al. 1991). In many democratic states, the receptiveness to experts within the state bureaucracy was manifest in the introduction of merit-based, selective personnel recruitment procedures and a widening use of examinations. The cases of the former USSR and Communist China suggest, by contrast, the ease with which experts and their expertise could be controlled by political loyalists. Even where it was facilitated, however, the welding of bureaucrats and professionals was bound to be conflict-ridden. The hierarchical authority structure of bureaucratic organizations has inevitably come to both challenge and be challenged by the more horizontal structure of professional peer controls (Wilson 1989, Larson 1977). Nevertheless, the professionalization of fields of public policy and
regulation boosted the deployment of more strictly instrumental frames of public actions in large areas. The synoptic gazes of power and science could thus converge in supporting such projects as holistic social engineering, city planning, and industrial scientific agriculture (Scott 1998).
5. Science, War, and Politics

The most dramatic and consequential collaboration between science and the modern state took place during wars. Such collaboration was usually facilitated by the atmosphere of general mobilization, which defined the war effort as the top goal of the nation. In a state of emergency, even democratic states have suspended, or at least limited, the competitive political process. Under such conditions, democracies have temporarily, albeit voluntarily, experienced the usual conditions of the authoritarian state: centralized command and control and a nationally coordinated effort. In the aftermath of such wars, politicians and experts often tried to perpetuate the optimal conditions of their wartime collaboration, regarding the resumption of usual political processes as disruptive of orderly rational procedures. In authoritarian nationalist or socialist states, open competitive politics was repressed as a matter of routine, along with the freedom of scientific research and the liberty of scientists to diffuse their findings and interpret them to students and the public at large (Mosse 1966). In such countries, the political elite usually faced the dilemma of how to exploit the resources of science for the war effort and for advancing its domestic goals without becoming vulnerable to the revolutionary potential of science as the expression of free reason, criticism, and what Robert K. Merton called 'organized skepticism' (1973). In the democratic state, the special affinities between the participatory ethos of citizens, free to judge and evaluate their government, and the values of scientific knowledge and criticism did not usually allow the coherences and clarities of the war period to survive for long, and the relations between science and the state had to readjust to conditions of open political contests and to the tensions between scientific advice to, and scientific criticism of, the state. Limitations imposed by open political contests were mitigated, however, in political cultures which encouraged citizens to evaluate and judge the government with reference to the adequacy of its performance in promoting public goals. The links between instrumental success and political legitimation preserved in such contexts the value of the interplay between expert advice to the government and to its public critics in the dynamic making of public policy and the construction of political authority. Political motives for appealing to the authority of science could of course be compatible with the readiness to ignore knowledge and
adequate performance, although it did not need to contradict the willingness to actually use scientific knowledge to improve performance. Discrepancies between the uses of scientific authority and scientific knowledge became widespread, however, because of the growing gaps between the timeframes of science and politics in democratic societies. In fields such as public health, economic development, security, and general welfare, instrumentally effective policies and programs must usually be measured over a period of years, and sometimes even decades. But the politicians whose status has come to be mediated by the modern mass media need to demonstrate their achievements within the timeframe of months or even days or weeks. In contemporary politics, one month is a long time, and the life expectancy of issues on the public agenda is usually much shorter. Such a state of affairs means that political actors often lack the political incentives to invest in the costly resources that are required for instrumental success whose payoffs are likely to appear only during the incumbency of their political successors.
6. Science and the Transformation of the Modern Democratic State

These conditions have encouraged leaders in democratic states increasingly to replace policy decisions and elaborate programs with grand political gestures (Ezrahi 1990). Where instant rewards can be obtained by well-advertised commitment to a particular goal or policy, the political motive to invest in substantive moves to change reality and improve governmental performance over the longer run may easily decline. Besides the tendency of democratic politics to privilege the present over the future in the allocation of public resources, the position of science in substantiating a long-term instrumental approach to public policy was further destabilized by the fragmentation of the normative mandates of state policies. Roughly since the 1960s, the publics of most Western democracies have become increasingly aware of the inherent constraints on the determination and ordering of values for the purpose of guiding public choices. Besides the influence of ethnic, religious, linguistic, or cultural differences on this process, such developments as the feminist movement reflect even deeper pressures to reorder basic values. Feminist spokespersons demanded, for example, that financial and scientific resources directed to control diseases be redistributed to redress gender discrimination in the treatment of specifically female diseases. In part, the feminist critique has been but one aspect of wider processes of individuation, which, towards the later part of the twentieth century, have increased the diversity of identities, tastes, lifestyles, and patterns of association in modern societies. This new pluralism implied the necessity of
continually renegotiating the uses of science and technology in the social context and locally adapting them to the diverse balances of values and interests in different communities. In the course of the twentieth century, leading scientific voices like J. B. S. Haldane, J. D. Bernal, and Jacques Monod tended to treat almost any resistance to the application of advanced scientific knowledge in human affairs as an 'abuse of science' resulting from irrationalism, ignorance, or prejudice. These and many other scientists failed to anticipate the changes in social and political values which complicated the relations of science and politics during the twentieth century and the constraints imposed by the inherently competitive value environment of science and policy making in all modern states. The constant, often unpredictable, changes in the value orientations of democratic publics and other related developments have clearly undermined confidence in the earlier separation between facts and values, science and politics, technological and political structures of action. Such a central element of policy making as risk assessment, for example, which for a long time was regarded as a matter best left to scientists, was gradually understood as actually a hybrid process combining science and policy judgments (Jasanoff 1990, Beck 1992). In contemporary society, in which the respective authorities of science and the state were demystified, knowledge has come to be regarded as too complex to directly check power, and power as too diffused to direct or repress the production and diffusion of knowledge. Historically, one of the most important latent functions of science in the sociocultural construction of the democratic political universe was to warrant the faith in an objective world of publicly certifiable facts which can function as standards for the resolution or assessment of conflicts of opinions. To the extent that scientists and technologists could be regarded as having the authority to state the valid relations between causes and effects, they were believed to be vital to the attribution of responsibility for the desirable or undesirable consequences of public actions. Policies were regarded within this model of science and politics as hypotheses, which are subject to tests of experience before a witnessing public (Ezrahi 1990). Elements of this conception have corresponded to the modern political experience. Although the public gaze has always been mediated, and to some extent manipulated, by hegemonic elites, publics could usually see when a war effort failed to achieve the stated objectives, when a major technology like a nuclear reactor failed and endangered millions of citizens, or when economic policy succeeded in producing affluence and stability. The expansion of this conception of government responsibility and accountability to regions outside the West even made it possible to redescribe fatal hunger in wide areas in India, for instance, as the consequence of policy rather
than natural disaster (Sen 1981). Still, in large areas, the cultural and normative presuppositions of the neutral public realm as a shared frame for the nonpartisan or nonideological description and evaluation of reality have not survived the proliferation of the modern mass media and the spread of political and cultural pluralism. In contemporary democracies, the pervasive mediating role of largely commercialized electronic communication systems has diminished the power of the state to influence public perceptions of politics and accentuated the weight of emotional and esthetic factors relative to knowledge or information in the representation and construction of political reality. Influenced by diverse specialized and individualized electronic media, the proliferation of partly insulated micro-sociocultural universes within the larger society has generated a corresponding proliferation of incommensurable notions of reality, causality, and factuality.
7. Transformations of the Institutional Structure and Intellectual Orientations of Science

Changes in the political and institutional environment of science in the modern state have been accompanied by changes in the internal institutional and intellectual life of science itself. In the modern democratic state, the status of science as a distinct source of certified knowledge and authority was related to its institutional autonomy vis-à-vis the structure of the state and its independence from private economic institutions. Freedom of research and academic freedom were regarded as necessary conditions for the production and diffusion of scientific knowledge and, therefore, also for its powers to ground apolitical advice and criticism in the context of public affairs. Such institutional autonomy was, of course, never complete. The very conditions of independence and freedom had to be negotiated in each society and reflected at least a tacit balance between the needs of science and the state as perceived by the government. But the international universalistic ethos of science tended to ignore or underplay such local variations (Solingen 1994). The institutional arrangements that secured a degree of autonomy and freedom to scientists seemed to warrant the distinction between basic and applied research, between pure research directed to the advancement of knowledge and research and development directed to advance specific industrial, medical, or other practical goals. An important function of this distinction was to balance the adaptation of science to the needs of the modern state and the preservation of the internal intellectual traditions and practices of scientific research and academic institutions. In practice, the relations between science and the state were more symbiotic than the public ethos of science would have suggested. Scientific institutions,
in order to function, almost always required public and political support and often also financial assistance. On the other hand, the state could not remain indifferent to the potential uses of scientific knowledge and authority to both secure adequate responses to problems and facilitate the legitimation of its actions by reference to apolitical expert authority. Thus, while the separation between truth and power and the respect for the boundaries between pure and applied sciences, as well as between politics and administration, were considered for a long time as the appropriate way to think about the relation between science and politics (Price 1965), the relations between science and the state, as well as between scientists and politicians, turned out to be much more interactive and conflict-ridden. Since scientists and politicians respectively controlled assets useful for each other's work, they were bound to be more active in trying to pressure each other to cooperate and engage in mutually useful exchanges (Gibbons et al. 1994). Not all these exchanges were useful to either science or the state in the long run. The mobilization of scientists to the war efforts of the modern nation-state, while it boosted the status of scientists domestically in the short run, had deleterious effects on the international network of scientific cooperation, as well as on the independence of science in the long run. Especially in cases of internally controversial wars, like the American involvement in Vietnam, the mobilization of science inevitably split the scientific community and politicized the status of science. Nevertheless, the unprecedented outlays of public money made available to science in the name of national defense allowed many research institutions to acquire expansive advanced facilities and instruments that permitted the boosting of pure science in addition to militarily related research and development. But the gains in size and in potential scientific advance were obtained at the cost of eroding the glorious insulation of scientific research and exposing its delicate internal navigational mechanisms to the impact of external political values and institutions (Leslie 1993). Yielding to the pressures of the modern nation-state to make science more relevant and useful to more immediate social goals inevitably reduced the influence of internal scientific considerations and the priorities of scientific research (Guston 2000). Following such developments, the university, the laboratory, and the scientific community at large could often appear less elitist and more patriotic, but also more detached from their earlier affinities to humanistic culture and liberal values. Like the partnership between science and the state, the partnership between science and the market, often facilitated by their collaboration during the war effort and postwar privatization of projects and services, has undermined the autonomy of science and the ethos of basic research in many late modern states. But, whereas the interpenetration between science and the
state subjected the internal intellectual values of science to the pressures of public goals and pressing political needs, the links between science and the private sector subordinated scientific norms to the private values of profit making. The conversion of scientific expertise into capital opened the way for substantial private support for the intellectual pursuits of science. But the linkages between science, industry, and capital only reinforced the decline of science and its institutions as a bastion of enlightenment culture, progress, and universal intellectual values. Thus, while scientific and technical knowledge and skills were increasingly integrated into a wider spectrum of industrial productions and commercial services, science as a whole became more amorphous, less recognizable or representable as a set of distinct institutions and a community with a shared ethos (Gibbons et al. 1994). An instructive illustration of these processes is the pressure exerted by both the state and private business firms on scientists to compromise the cherished professional norm of publicity. In the context of science, the transparency of methodology and the publicity of research results have for a long time been reinforced by both ethical commitment and practical needs. The methodologies and findings of any research effort are the building blocks, the raw materials, with which scientists in discrete places produce more knowledge. Transparency and publicity are necessary for the orderly flow of the research process and the operation of science as a cooperative enterprise (Merton 1973). In addition, publicity has been a necessary element of the special status of science as public knowledge. From the very beginning, pioneering scientists such as Bacon, Boyle, and Lavoisier distinguished scientific claims from the claims of magic, alchemy, cabala, and other esoteric practices by the commitment to transparency. Transparency was also what rendered claims of scientific knowledge publicly acceptable to democratic citizens (Tocqueville 1945). Turning scientific research into a state secret in order to gain an advantage in war, or into an economic (commercial) secret in order to gain an advantage in the market, was not congenial to sustaining the cooperative system of science or the early luster of science as an embodiment of noble knowledge and virtues which transcended national loyalties and sectoral interests. In the aftermath of such developments, the virtues of objectivity, disinterestedness, universality, and rationality which were associated with earlier configurations of science (Polanyi 1958, 1962, Merton 1973) appeared unsustainable. Moreover, the fact that the resources of science and technology could be enlisted not only by liberal and democratic but also by fascist, communist, and other authoritarian regimes accentuated the image of science as an instrument insufficiently constrained by internal norms and flexible enough to serve contradictory, unprogressive, and extremely controversial causes.
8. The Postmodern Condition and the Reconfiguration of Science and the State

These developments had a profoundly paradoxical impact on the status of science in the late-modern state. Especially since the closing decades of the twentieth century, while such projects as the decoding of the human genome indicate that scientific knowledge has advanced beyond even the most optimistic predictions, the authority of science in society and in the context of public affairs has suffered a sharp decline. In order to consider this state of affairs as reversible, one needs to believe that the insular autonomy of science can be restored and that the political and economic environments of science can become more congenial for sustaining this condition. Such expectations seem unwarranted. On the other hand, an apparent decline in the distinct social and political value of scientific authority may not necessarily undermine the impact of scientific knowledge on the ways states and governments act. What appears against the past as the decline of scientific authority in the larger social and political context may even be redescribed as a reconfiguration of the authority of expertise in the postmodern state. The earlier distinctions between science, politics, law, ethics, economy, and the like may have become irrelevant in complex contemporary societies, in which the use of scientific knowledge requires finer, more intricate, and continual adjustments and readjustments between science and other expressions of truth or power. In the absence of hegemonic ideological or national frames, the macropolitical sphere of the state divides into a multitude of—often just temporary and amorphous—subgroups, each organized around a particular order of values and interests. In each of these particular universes, scientific knowledge and expertise can enjoy privileged authority as a means to advance shared goals. When these groups clash or compete in the wider political arena, their experts tend to take sides and function as advocates. The state and its regulatory institutions enlist their own experts to back up their particular perspectives. Within this new, more open-ended and dynamic system, science is more explicitly and reflexively integrated into social, economic, and political values. In the context of the new pluralist polity, the authority and knowledge of science may therefore be less distinctly visible and less relevant to the symbolic, ideological, or cultural aspects of politics. At the same time, it is no less present in the organization, management, and the normative constitution of social order. See also: Academy and Society in the United States: Cultural Concerns; Development and the State; Higher Education; History of Science: Constructivist Perspectives; Science and Industry; Science, Economics of; Science Funding: Asia; Science Funding: Europe; Science Funding: United States; Science,
Social Organization of; Science, Sociology of; Science, Technology, and the Military; Scientific Academies, History of; Scientific Academies in Asia; State and Society; Universities, in the History of the Social Sciences
Bibliography

Balogh B 1991 Chain Reaction: Expert Debate and Public Participation in American Commercial Nuclear Power, 1945–1975. Cambridge University Press, New York
Beck U 1992 Risk Society: Towards a New Modernity. Sage Publications, London
Dumont L 1986 Essays on Individualism: Modern Ideology in Anthropological Perspective. University of Chicago Press, Chicago
Eisenstadt S N 1999 Paradoxes of Democracy: Fragility, Continuity and Change. The Woodrow Wilson Center Press, Washington, DC
Ezrahi Y 1990 The Descent of Icarus: Science and the Transformation of Contemporary Democracy. Harvard University Press, Cambridge, MA
Ezrahi Y 1996 Modes of reasoning and the politics of authority in the modern state. In: Olson D R, Torrance N (eds.) Modes of Thought: Explorations in Culture and Cognition. Cambridge University Press, Cambridge, UK
Gibbons M, Nowotny H, Limoges C, Schwartzman S, Scott P, Trow M (eds.) 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London
Guston D H 2000 Between Politics and Science: Assuring the Integrity of Research. Cambridge University Press, New York
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Larson M S 1977 The Rise of Professionalism. University of California Press, Berkeley, CA
Leslie S W 1993 The Cold War and American Science. Columbia University Press, New York
Manuel F E 1974 The Religion of Isaac Newton. Clarendon Press, Oxford, UK
Merton R K 1973 The normative structure of science. In: Storer N W (ed.) The Sociology of Science. University of Chicago Press, Chicago
Mosse G L 1966 Nazi Culture. Schocken Books, New York
Polanyi M 1958 Personal Knowledge: Towards a Post-Critical Philosophy. University of Chicago Press, Chicago
Polanyi M 1962 The republic of science: its political and economic theory. Minerva 1: 54–73
Porter T M 1986 The Rise of Statistical Thinking 1820–1900. Princeton University Press, Princeton, NJ
Porter T M 1995 Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton University Press, Princeton, NJ
Price D K 1965 The Scientific Estate. Belknap, Cambridge, MA
Scott J C 1998 Seeing Like a State. Yale University Press, New Haven, CT
Sen A K 1981 Poverty and Famines: An Essay on Entitlement and Deprivation. Clarendon Press, Oxford, UK
Shapin S, Schaffer S 1985 Leviathan and the Air-Pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
Skinner Q 1996 Reason and Rhetoric in the Philosophy of Hobbes. Cambridge University Press, Cambridge, UK
Solingen E (ed.) 1994 Scientists and the State: Domestic Structures and the International Context. University of Michigan Press, Ann Arbor, MI
Sprat Th [1667] 1958 History of the Royal Society. Washington University Press, St Louis, MO
Tocqueville A de 1945 Democracy in America, Bradley Ph (ed.). Vintage Books, New York
Wagner P, Weiss C H, Wittrock B, Wollman H 1991 Social Sciences and Modern States. Cambridge University Press, Cambridge, UK
Wilson J Q 1989 Bureaucracy: What Government Agencies Do and Why They Do It. Basic Books, New York
Y. Ezrahi
Science, Economics of

Determining the principles that govern the allocation of resources to science, and understanding the management and consequences of the use of those resources, are the central issues of the economics of science. Studies in this field began with the assumption that science was a distinct category of public spending that required rationalization. They have moved towards the view that science is a social system with distinct rules and norms. As the systems view has developed, the focus of the economics of science has moved from the effects of science on the economy to the influence of incentives and opportunities on scientists and research organizations. There is a productive tension between viewing science as a social instrument and as a social purpose. In the first view, science is a social investment in the production and dissemination of knowledge that is expected to generate economic returns as this knowledge is commercially developed and exploited. This approach has the apparent advantage that the standard tools of economic analysis might be directly employed in choosing how to allocate resources to science and manage their use. In the second approach, science is assumed to be a social institution whose norms and practices are distinct from, and only partially reconcilable with, the institutions of markets. While this second approach greatly complicates the analysis of resource allocation and management, it may better represent the actual social organization of science and the behavior of scientists, and it may therefore ultimately produce more effective rules for resource allocation and better principles for management. Both approaches are examined in this article, although it is the first that accounts for the majority of the economics of science literature (Stephan 1996).
1. The Economic Analysis of Science as a Social Instrument

In arguing for a continuing high rate of public funding of science following World War II, US Presidential
Science Advisor Vannevar Bush (1945) crafted the view that science is linked intrinsically to technological and economic progress as well as being essential to national defense. The aim of 'directing' science to social purposes was already well recognized, and had been most clearly articulated before the war by John D. Bernal (1939). What distinguished Bush's argument was the claim that science had to be curiosity-driven and that, in companies, such research would be displaced by the commercial priorities of more applied research. The view that science is the wellspring of economic growth became well established within the following generation, giving rise to statements like 'Basic research provides most of the original discoveries from which all other progress flows' (United Kingdom Council for Scientific Policy 1967). The concept of science as a source of knowledge that would be progressively developed and eventually commercialized became known as the 'linear model.' In the linear model, technology is science reduced to practical application. The 'linear model' is an oversimplified representation that ignores the evidence that technological change is often built upon experience and ingenuity divorced from scientific theory or method, the role of technological developments in motivating scientific explanation, and the technological sources of instruments for scientific investigation (Rosenberg 1982). Nonetheless, it provides a pragmatic scheme for distinguishing the role of science in commercial society. If science is instrumental in technological progress and ultimately economic growth and prosperity, it follows that the economic theory of resource allocation should be applicable to science. Nelson (1959) and Arrow (1962) demonstrated why market forces could not be expected to generate the appropriate amount of such investment from a social perspective. Both Arrow and Nelson noted that in making investments in scientific knowledge, private investors would be unable to capture all of the returns to their investment because they could not charge others for the use of new scientific discoveries, particularly when those discoveries involved fundamental understanding of the natural world. Investment in scientific knowledge therefore had the characteristics of a 'public good', like publicly accessible roads. This approach established a basis for justifying science as a public investment. It did not, however, provide a means for determining what the level of that investment should be. Investments in public goods are undertaken, in principle, subject to the criterion that benefits exceed costs by an amount that is attractive relative to other investments of public funds. To employ this criterion, a method for determining the prospective returns or benefits from scientific knowledge is required. The uncertainty of scientific outcomes is not, in principle, a fundamental barrier to employing this method. In practice, it is often true that the returns from investments
in public good projects are uncertain, and prospective returns often involve attributing to new projects the returns from historical projects. Griliches (1958) pioneered a methodology for retrospectively assessing the economic returns on research investment, estimating that social returns of 700 percent had been realized in the period 1933–55 from the $2 million of public and private investments in the development of hybrid corn from 1910–55. Other studies of agricultural innovation as well as a limited number of studies of industrial innovation replicated Griliches' findings of a high social rate of return (see Steinmueller 1994 for references). Mansfield (1991) provides a fruitful way of continuing to advance this approach. Mansfield asked R&D executives to estimate the proportion of their company's products and processes commercialized in 1975–85 that could not have been developed, or would have been substantially delayed, without academic research carried out in the preceding 15 years. He also asked them to estimate the 1985 sales of the new products and cost savings from the new processes. Extrapolating the results from this survey to the total investment in academic research and the total returns from new products and processes, Mansfield concluded that this investment had produced the (substantial) social rate of return of 28 percent. The preceding discussion could lead one to conclude that the development of a comprehensive methodology for assessing the rate of return based on scientific research was only a matter of greater expenditure on economic research. This conclusion would be unwarranted. Efforts to trace the returns from specific government research efforts (other than in medicine and agriculture) have been less successful. The effort by the US Department of Defense Project Hindsight to compute the returns from defense research expenditures not only failed to reveal a positive rate of return, but also rejected the view that 'any simple or linear relationship exists between cost of research and value received' (Office of the Director of Defense Research and Engineering 1969). Similar problems were experienced when the US National Science Foundation sought to trace the basic research contributions underlying several major industrial innovations (National Science Foundation 1969). In sum, retrospective studies based on the very specific circumstances of 'science enabled' innovation or upon much broader claims that science as a whole contributes a resource for commercial innovation seem to be sustainable. When these conditions do not apply, as in the cases of specific research programs with uncertain application or efforts to direct basic research to industrial needs, the applicability of retrospective assessment, and therefore its value for resource allocation policy, is less clear. More fundamentally, imputing a return to investments in scientific research requires assumptions about the 'counter-factual' course of developments that
would have transpired in the absence of specific and identified contributions of science. In examples like hybrid corn or the poliomyelitis vaccine, a reasonable assumption about the 'counter-factual' state of the world is a continuation of historical experience. Such assumptions are less reasonable in cases where scientific contributions enable a particular line of development but compete with alternative possibilities or where scientific research results are 'enabling' but are accompanied by substantial development expenditures (David et al. 1992, Mowery and Rosenberg 1989, Pavitt 1993). For science to be analyzed as a social instrument, scientific activities must be interpreted as the production of information and knowledge. As the results of this production are taken up and used, they are combined with other types of knowledge in complex ways for which the 'linear model' is only a crude approximation. The result is arguably, and in some cases measurably, an improvement in economic output and productivity. The robustness and reliability of efforts to assess the returns to science fall short of standards that are employed in allocating public investment resources. Nonetheless, virtually every systematic study of the contribution of science to the economy has found appreciable returns to this social investment. The goals of improving standards for resource allocation and management may be better served, however, by analyzing science as a social institution.
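The retrospective assessments discussed above all reduce, at bottom, to comparing a stream of research costs with a stream of imputed social benefits and solving for the discount rate at which the two balance. The following minimal Python sketch illustrates that arithmetic; the cost and benefit figures are entirely hypothetical and are not drawn from Griliches' or Mansfield's data.

```python
# Illustrative only: hypothetical cost/benefit streams, not actual
# data from Griliches (1958) or Mansfield (1991).

def npv(rate, flows):
    """Net present value of (year, amount) cash flows at a given discount rate."""
    return sum(amount / (1 + rate) ** year for year, amount in flows)

def social_rate_of_return(flows, lo=0.0, hi=10.0, tol=1e-6):
    """Internal rate of return found by bisection: the rate at which
    discounted social benefits exactly offset discounted research costs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, flows) > 0:
            lo = mid  # still profitable at this rate: the breakeven rate is higher
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical program: research outlays (negative) for five years,
# followed by fifteen years of imputed social benefits (positive).
flows = [(year, -1.0) for year in range(5)]
flows += [(year, 2.5) for year in range(5, 20)]

print(f"social rate of return: {social_rate_of_return(flows):.1%}")
```

For this made-up profile the breakeven rate happens to come out near 28 percent; the substantive difficulty, as the text notes, lies not in the arithmetic but in establishing the counter-factual benefit stream that the positive cash flows are supposed to represent.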
2. Science as a Social Institution

The economic analysis of science as a social system begins by identifying the incentives and constraints that govern the individual choices of scientists; these may reflect persistent historical features of science or contemporaneous policies. Incentives may include tangible rewards such as monetary awards; intangible, but observable, rewards such as status; and less observable rewards such as personal satisfaction. Similarly, constraints should be interpreted broadly, including not only financial limitations but also constraints stemming from institutional rules, norms, and standards of practice. The following simplified account suggests one of several ways of assembling these elements into a useful analytical framework. Becoming a scientist requires substantial discipline and persistence in educational preparation as well as skills and talents that are very difficult to assess. Scientific training may be seen as a filter for selecting from prospective scientists those who have the ability and drive to engage in a scientific career. In addition, the original work produced during research training demonstrates the capacity of the researcher and provides a means for employers to assess the talents of the researcher (David 1994). Analyzing science education as an employment filter is a complement to
more traditional studies of the scientific labor market such as those reviewed by Stephan (1996). The employment filter approach may also waste human resources by making schooling success the only indicator of potential for scientific contribution. If, for example, the social environment of the school discourages the participation or devalues the achievement of women or individuals from particular ethnic groups, the filter system will not perform as a meritocracy. The distinctive features of science as a social system emerge when considering the incentives and constraints facing employed scientists. Although there is a real prospect of monetary reward for outstanding scientific work (Zuckerman 1992), many of the incentives governing scientific careers are related to the accumulation of professional reputation (Merton 1973). While Merton represented science as ‘universalist’ (open to claims from any quarter), the ability to make meaningful claims requires participation in scientific research networks, participation that is constrained by all of the social processes that exclude individuals from such social networks or fail to recognize their contribution. The incentive structure of seeking the rewards from professional recognition, and the social organization arising from it, is central to the ‘new economics of science’ (Dasgupta and David 1994). The new economics of science builds upon sociological analyses (Cole and Cole 1973, Merton 1973, Price 1963) of the mechanisms of cumulative reinforcement and social reward within science. From an economic perspective, the incentive structure governing science is the result of the interactions between the requirement of public disclosure and the quest for recognition of scientific ‘priority’, the first discovery of a scientific result. Priority assures the alignment of individual incentives with the social goal of maximizing the scientific knowledge base (Dasgupta and David 1987). Without the link between public disclosure and the reward of priority, it seems likely that scientists would have an incentive to withhold key information necessary for the further application of their discoveries (David et al. 1999). As Stephan (1996) observes, the specific contribution of the new economics of science is in linking this incentive and reward system to resource allocation issues. Priority not only brings a specific reward of scientific prestige and status but also increases the likelihood of greater research support. Cumulative advantage therefore not only carries the consequence of attracting attention, it also enables the recruitment of able associates and students and provides the means to support their research. These effects are described by both sociologists of science and economists as the Matthew effect after Matthew 25:29, ‘For to every one who has will more be given, and he will have abundance; but from him who has not, even what he has will be taken away.’ As in the original parable, it
may be argued that this allocation is appropriate since it concentrates resources in the hands of those who have demonstrated the capacity to produce results. The race to achieve priority and hence to collect the rewards offered by priority may, however, lead to inappropriate social outcomes because priority is a 'winner take all' contest. Too many resources may be applied to specific races to achieve priority and too few resources may be devoted to disseminating and adapting scientific research results (Dasgupta and David 1987, David and Foray 1995), a result that mirrors earlier literature on patent and technology discovery races (Kamien and Schwartz 1975). Moreover, the mechanisms of cumulative advantage resulting from achieving priority may reduce diversity in the conduct of scientific research. This system has the peculiarity that the researchers who have the greatest resources and freedom to depart from existing research approaches are the same ones who are responsible for creating the status quo.
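The logic of cumulative advantage is easy to make concrete. The toy simulation below is a simple Pólya-urn-style sketch (an illustration, not a model taken from the studies cited here): each new grant goes to a scientist with probability proportional to the resources already accumulated, so that small early differences in luck compound into a markedly skewed distribution.

```python
import random

# Toy Polya-urn model of cumulative advantage (the "Matthew effect"):
# each new grant is awarded with probability proportional to the
# resources a scientist has already accumulated.

def simulate(n_scientists=20, n_grants=1000, seed=42):
    random.seed(seed)
    resources = [1.0] * n_scientists  # everyone starts with the same endowment
    for _ in range(n_grants):
        winner = random.choices(range(n_scientists), weights=resources)[0]
        resources[winner] += 1.0  # success breeds further success
    return sorted(resources, reverse=True)

resources = simulate()
top_share = sum(resources[:4]) / sum(resources)
print(f"share held by the top 20% of scientists: {top_share:.0%}")
```

Runs of this kind typically end with the top fifth of the simulated scientists holding on the order of half of the accumulated resources, even though all start identically; this is the concentration dynamic that the Matthew effect describes, and it illustrates why a winner-take-all reward can look efficient ex post while narrowing diversity ex ante.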
The principal challenges to the view that science is a distinct social system are the growing number of scientific publications by scientists employed in private industry (Katz and Hicks 1996) and the argument that scientific knowledge is tightly bound to social networks (Callon 1994). Private investments in scientific research would appear to question the continuing validity of the 'public good' argument. For example, Callon (1994) contends that scientific results are, and have always been, strongly 'embedded' within networks of researchers and that 'public disclosure' is therefore relatively useless as a means of transfer for scientific knowledge. Gibbons et al. (1994) argue that research techniques of modern science have become so well distributed that public scientific institutions are no longer central to scientific activity. While the arguments of both Callon and Gibbons et al. suggest that private scientific research is a direct substitute for publicly funded research, other motives for funding and publication such as gaining access to scientific networks suggest that public and private research are complementary (David et al. 1999). The growing reliance of industry on science provides a justification for investing in science to improve the 'absorption' of scientific results (Cohen and Levinthal 1989, Rosenberg 1990). Employed scientists need to be connected with other scientific research colleagues who identify 'membership' in the scientific community with publication, and labor force mobility for employed scientists requires scientific publication. Thus, it is premature to conclude that the growing performance of scientific research in industry or publication of scientific results by industrial authors heralds the end of the need for public support of science. The growing significance of private funding of scientific research does, however, indicate the need to improve the socioeconomic analysis of the incentive and governance structures of science. Empirical work on the strategic and tactical behavior of individual scientists, research groups, and organizations is urgently needed to trace the implications of the changing environment in which the social institutions of science are evolving. Ultimately, these studies should be able to meet the goal of developing better rules for allocating and managing the resources devoted to science. See also: Innovation, Theory of; Research and Development in Organizations; Research Funding: Ethical Aspects; Science and the State; Science Funding: Asia; Science Funding: Europe; Science Funding: United States; Science, Technology, and the Military
Bibliography

Arrow K J 1962 Economic welfare and the allocation of resources for invention. In: The Rate and Direction of Inventive Activity. National Bureau of Economic Research, Princeton University Press, Princeton, NJ
Bernal J D 1939 The Social Function of Science. MIT Press, Cambridge, MA. MIT Press paperback edition 1967
Bush V 1945 Science: The Endless Frontier: A Report to the President on a Program for Postwar Scientific Research. United States Office of Scientific Research and Development, Washington, DC. National Science Foundation reprint 1960
Callon M 1994 Is science a public good? Fifth Mullins Lecture, Virginia Polytechnic Institute, 23 March 1993. Science, Technology and Human Values 19: 395–424
Cohen W M, Levinthal D A 1989 Innovation and learning: the two faces of R&D. Economic Journal 99(397): 569–96
Cole J R, Cole S 1973 Social Stratification in Science. University of Chicago Press, Chicago, IL
Dasgupta P, David P A 1987 Information disclosure and the economics of science and technology. In: Feiwel G R (ed.) Arrow and the Ascent of Modern Economic Theory. New York University Press, New York, pp. 519–40
Dasgupta P, David P A 1994 Toward a new economics of science. Research Policy 23(5): 487–521
David P A 1994 Positive feedbacks and research productivity in science: reopening another black box. In: Granstrand O (ed.) Economics of Technology. North-Holland, Amsterdam and London, pp. 65–89
David P A, Foray D 1995 Accessing and expanding the science and technology knowledge base. Science and Technology Industry Review 16: 13–68
David P A, Foray D, Steinmueller W E 1999 The research network and the new economics of science: from metaphors to organizational behaviours. In: Gambardella A, Malerba F (eds.) The Organization of Economic Innovation in Europe. Cambridge University Press, Cambridge, UK, pp. 303–42
David P A, Mowery D, Steinmueller W E 1992 Analysing the economic payoffs from basic research. Economics of Innovation and New Technology 2: 73–90
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London
Griliches Z 1958 Research costs and social returns: hybrid corn and related innovations. Journal of Political Economy (October): 419–31
Kamien M I, Schwartz N L 1975 Market structure and innovation: a survey. Journal of Economic Literature 13(1): 1–37
Katz J S, Hicks D M 1996 A systemic view of British science. Scientometrics 35(1): 133–54
Mansfield E 1991 Academic research and industrial innovation. Research Policy 20(1): 1–12
Merton R 1973 The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, Chicago, IL
Mowery D C, Rosenberg N 1989 Technology and the Pursuit of Economic Growth. Cambridge University Press, Cambridge, UK
National Science Foundation 1969 Technology in Retrospect and Critical Events in Science (TRACES). National Science Foundation, Washington, DC
Nelson R R 1959 The simple economics of basic scientific research. Journal of Political Economy 67(June): 297–306
Office of the Director of Defense Research and Engineering 1969 Project Hindsight: Final Report. Washington, DC
Pavitt K 1993 What do firms learn from basic research? In: Foray D, Freeman C (eds.) Technology and the Wealth of Nations: The Dynamics of Constructed Advantage. Pinter Publishers, London
Price D J de Solla 1963 Little Science, Big Science. Columbia University Press, New York
Rosenberg N 1982 Inside the Black Box: Technology and Economics. Cambridge University Press, Cambridge, UK
Rosenberg N 1990 Why do firms do basic research (with their own money)? Research Policy 19(2): 165–74
Steinmueller W E 1994 Basic science and industrial innovation. In: Dodgson M, Rothwell R (eds.) Handbook of Industrial Innovation. Edward Elgar, London, pp. 54–66
Stephan P E 1996 The economics of science. Journal of Economic Literature XXXIV(September): 1199–235
United Kingdom Council for Scientific Policy (Great Britain) 1967 Second Report on Science Policy. HMSO, London
Zuckerman H 1992 The proliferation of prizes: Nobel complements and Nobel surrogates in the reward system of science. Theoretical Medicine 13: 217–31
W. E. Steinmueller
Science Education

The launch of Sputnik in 1957 almost single-handedly conferred on science education the status of an Olympic sport. A series of international comparison studies along with the increasing importance of science and technology in the global economy have nurtured this image (e.g., Schmidt et al. 1997). Stakeholders, including policy makers, natural scientists, textbook publishers, test designers, classroom teachers, business leaders, and pedagogical researchers, have responded with competing strategies for improving science education. In some countries, most notably the United States, powerful groups have mandated untested policies and unachievable educational goals—often driven by a single aspect of the problem such as textbooks, assessments, curriculum frameworks, peer learning, or technological innovation. Experience with failed policy initiatives and ambiguous research findings has highlighted the systemic, intricate, complex, interconnected nature of science
education. Innovations succeed under some circumstances, but not others. Students regularly fail to learn what they are taught, prefer ideas developed from personal experiences, and make inferences based on incomplete information. Curriculum designers often neglect these aspects of learners and expect students to absorb all the information in a textbook or learn to criticize research on ecology by studying logic puzzles (Linn and Hsi 2000, Pfundt and Duit 1991). International studies of science learning in over 50 countries raise questions about the complex relationships between achievement and curriculum, or between interest in science and science learning, or between teachers’ science knowledge and student success that many have previously taken for granted. This systemic character of science education demands a more nuanced and contextual understanding of the development of scientific understanding and the design of science instruction, as well as new research methods, referred to as design studies, to investigate the impact of innovations.
1. Developing Scientific Understanding

1.1 Knowledge Integration

Most researchers would agree that science learners engage in a process of knowledge integration, making sense of diverse information by looking for patterns and building on their own ideas (Bransford et al. 1999, Bruer 1993). Knowledge integration involves linking and connecting information, seeking and evaluating new ideas, as well as revising and reorganizing scientific ideas to make them more comprehensive and cohesive. Designing curriculum materials to support and guide the process of knowledge integration has proven difficult. Most textbooks and even hands-on activities reflect a view of learners as absorbing information rather than attempting to integrate new ideas with their existing knowledge. Recent research provides guidance to curriculum designers by describing the interpretive, cultural, and deliberate dimensions of knowledge integration. Learners interpret new material in light of their own ideas and experiences, frequently relying on personal perspectives rather than instructed ideas. For example, science learners often believe that objects in motion come to rest, based on their extensive observations of the natural world. Learning happens in a cultural context where group norms, expectations, and supports shape learner activity. For example, when confronted with the Newtonian view that objects in motion remain in motion, many learners conclude that objects come to rest on the playground but remain in motion in science class, invoking separate norms for these distinct contexts. Individuals make deliberate decisions about their science learning, develop commitments about reusing what they learn, pay attention to some science debates
but not others, and select or avoid science as a career. For example, some students desire a cohesive account of topics like motion and seek to explain new phenomena such as the role of friction in nanotechnology, while others report with pride that they have 'forgotten everything taught in science class.' The interpretive, cultural, and deliberate dimensions of science learning apply equally to pre-college students, preservice and in-service science teachers, research partnerships, and lifelong science learners. These perspectives help clarify the nature of knowledge integration and suggest directions for design of science instruction.

1.2 Interpretive Nature of Science Learning

Learners develop scientific expertise by interpreting the facts, processes, and inquiry skills they encounter in terms of their own experiences and ideas. Experts in science develop richly connected ideas, patterns, and representations over the years and regularly test their views by interpreting complex situations, looking for anomalies, and incorporating new findings. For example, expert physicists use free body diagrams to represent mechanics problems while novices rely on the formulas they learn in class. Piaget (1971) drew attention to the ideas that students bring to science class such as the notion that the earth is round like a pancake, or that heavier objects displace more volume. Piaget offered assimilation and accommodation as mechanisms to account for the process of knowledge integration. Vygotsky (1962) distinguished spontaneous ideas developed from personal experience such as the view that heat and temperature are the same from the instructed distinction between heat and temperature. Recent research calls for a more nuanced view of knowledge integration by showing that few learners develop a coherent perspective on scientific phenomena: most students develop 'knowledge in pieces' and retain incohesive ideas in their repertoire (diSessa 2000). Even experts may give incohesive views of phenomena when asked to explain at varied levels of granularity. For example, scientists may have difficulty designing a picnic container to keep food safe even when they have expert knowledge of molecular kinetic theory (Linn and Hsi 2000). Science textbooks, lectures, films, and laboratory experiments often reinforce students' incoherent views of science; they offer disconnected, inaccessible ideas, and avoid complex personally-relevant problems. For example, many texts expect students to gain understanding of friction by analyzing driving on icy roads. But non-drivers and those living in warm climates find this example unfamiliar and inaccessible. Similarly, research on heat and temperature reveals that difficulties in understanding the particulate nature of matter stand in the way of interpreting the molecular-kinetic model. The engineering-based heat flow model
offers a more descriptive and accessible account of many important aspects of heat and temperature such as wilderness survival or home insulation or thermal equilibrium. Designing instruction to stimulate the interpretive process means carefully selecting new, compelling ideas to add to the views held by students and supporting students as they organize, prioritize, and compare these various ideas. This process of weighing alternative accounts of scientific phenomena can clash with student ideas about the nature of science and of science learning. Many students view science knowledge as established and science learning as memorization. To promote knowledge integration, students need to interpret dynamic examples of science in the making and they need to develop norms for their own scientific reasoning. Students need a nuanced view of scientific investigation that contrasts methodologies and issues in each discipline. General reasoning skills and critical thinking are not sufficient. Instead, students need to distinguish the epistemological underpinnings of methodologies for exploring the fossil record, for example, from those required for study of genetic engineering. They need to recognize pertinent questions for research on earthquake-resistant housing, DNA replication, and molecular modeling. Effective knowledge integration must include an understanding of the ethical and moral dilemmas involved in diverse scientific areas, as well as the nature of scientific advances. Research on how students make sense of science suggests some mechanisms to promote the interpretive process of knowledge integration. Clement (1991) calls for designing bridging analogies to help students make effective connections. For example, to help students understand the forces between a book and a table, Clement recommends comparing the experience of placing the book on a spring, a sponge, a mattress, and possibly, releasing it on air. Linn and Hsi (2000) call on designers to create pivotal cases that enable learners to make subtle connections and reconsider their views. For example, a pivotal scientific visualization of the relative rates of heat flow in different materials helped students interpret their personal observations that, at room temperature, metals feel colder than wood. To help students sort out alternative experiences, observations, school ideas, and intuitions, research shows the benefit of encouraging students to organize their knowledge into larger patterns, motivating students to critique alternative ideas, and establishing a taste for cohesive ideas.

1.3 Cultural Context of Science Learning

All learning occurs in a cultural context where communities respond to competing perspectives, confer status on research methods, and establish group norms and expectations. Students benefit from the cultural
context of science, for example, when they find explanations provided by peers more comprehensible than those in textbooks or when peers debate alternative views. Expert scientists have well established mechanisms for peer review, norms for publications, and standards for inquiry practices. Group norms are often institutionalized in grant guidelines, promotion policies, and journal publication standards. The cultural context of the classroom has its own characteristics that may or may not support and encourage the development of cohesive, sustained inquiry about science. Students may enact cultural norms that exclude those from groups underrepresented in science from the discourse (Wellesley College Center for Research on Women 1992, Keller 1983) or limit opportunities for students to learn from each other. Textbooks and standardized tests may privilege recall of information over sustained investigation. Images of science in curriculum materials and professional development programs may neglect societal or policy issues. Contemporary scientific controversies, such as the current international debate about genetically modified foods, rarely become topics for science classes. As a result, students may develop flawed images of science in the making and lack the ability to make a critical evaluation of news accounts of personally-relevant issues. For example, to make decisions about the cultivation and consumption of genetically-modified foods, students would ideally compare the risks from traditional agricultural practices such as hybridization to the risks from genetic modification. They would also weigh issues of economics, world hunger, and individual health. In addition, students studying this controversy would learn to distinguish comments from scientists supported by the agricultural industry, environmental protection groups, and government grants. Examining knowledge integration from a cultural perspective helps to clarify the universality of controversy in science and the advantages of creating classroom learning communities that illustrate the role of values and beliefs in scientific work (Brown 1992, Bransford and Brown 1999).

1.4 Deliberative Perspective on Science Learning

Students make deliberate decisions about science learning, their own progress, and their careers. Lifelong science learners deliberately reflect on their views, consider new accounts of scientific problems, and continuously improve their scientific understanding; they seek a robust and cohesive account of scientific phenomena. Designing instruction that develops student responsibility for science learning creates a paradox. In schools, science curriculum frameworks, standards, texts, and even recipe-driven hands-on experiences leave little opportunity for independence. Yet, the curriculum cannot possibly provide all the necessary information about science; instead, students
need supervised practice in analyzing their own progress in order to guide their own learning. Many instructional frameworks offer mechanisms leading to self-guided, intentional learning (Linn and Hsi 2000, White and Frederiksen 1998). Vygotsky (1962) drew attention to creating a 'zone of proximal development' by designing accessible challenges so that students, supported by instruction and peers, could continue to engage in knowledge integration. Vygotsky argued that, when students encounter new ideas within their zone of proximal development and have appropriate supports, they can compare and analyze spontaneous and instructed ideas, achieve more cohesive understandings, and expand their zone of proximal development. With proper support students can even invent and refine representations of their experimental findings (diSessa 2000). Others have shown that engaging students in guided reflection, analysis of their own progress, and critical review of their own or others' arguments establishes a more deliberative stance towards science learning (Linn and Hsi 2000, White and Frederiksen 1998).
2. Designing Science Instruction

Design of science instruction occurs at the level of state and national standards, curriculum frameworks for science courses, materials such as textbooks or software, and activities carried out by both students and teachers. In many countries, a tension has emerged between standards that mandate fleeting coverage of a list of important scientific topics and the concerns of classroom teachers that students lack the opportunity to develop a disposition towards knowledge integration and lifelong science learning. As science knowledge explodes, citizens face increasingly complex science-related decisions, and individuals need to regularly update their workplace skills. To make decisions about personal health, environmental stewardship, or career advancement, students need a firm foundation in scientific understanding, as well as experience interpreting complex problems. To lead satisfying lives, students need to develop lifelong learning skills that enable them to revisit and refine their ideas and to guide their own science learning in new topic areas. Recent research demonstrates the need to design and test materials to be sure they are promoting knowledge integration and setting learners on a path towards lifelong learning. Frameworks for design of science instruction for knowledge integration call for materials and activities that feature accessible ideas, make thinking visible, help students learn from others, and encourage self-monitoring (Bransford et al. 2000, Linn and Hsi 2000, White and Frederiksen 1998).

2.1 Designing Accessible Ideas

To promote knowledge integration, students need a designed curriculum that includes pivotal cases and
bridging analogies to help them learn. Rather than asking experts to identify the most sophisticated ideas, designers need to select the most accessible and generative ideas to add to the mix of student views. College physics courses generally start with Newton rather than Einstein; in pre-college courses one might start with everyday examples from the playground rather than the more elegant but less understandable frictionless problems. To make the process of lifelong knowledge integration accessible, students need some experience with sustained, complex inquiry. Carrying out projects, such as developing a recycling plan for a school or researching possible remedies for the worldwide threat of malaria, engages students in the process of scientific inquiry and can establish lifelong learning skills. Often, however, science courses neglect projects or provide experiences less conducive to knowledge integration, such as a general introduction to critical thinking or hands-on recipes for solving unambiguous problems. By using computer learning environments to help guide students as they carry out complex projects, curriculum designers can foster a robust understanding of inquiry (Driver et al. 1996). Projects take instructional time, require guidance for individual students, bring the complexities of science to life, and depend on well-designed questions. Students often confuse variables such as food and appetite, rely on flawed arguments from advertisements or other sources, and flounder because they lack criteria for critiquing their own progress. For example, when students critique projects, they may comment on neatness and spelling, rather than looking for flaws in an argument. Many teachers avoid projects because they have not developed the pedagogical skills necessary to mentor students, deal with the uncertainties of contemporary science dilemmas, or design researchable questions. Research shows that computer learning environments can make complex projects more successful by scaffolding inquiry, providing help and hints, and freeing teachers to interact with students about complex science issues (Feurzeig and Roberts 1999, Linn and Hsi 2000, White and Frederiksen 1998).
2.2 Making Thinking Visible

Students, teachers, and technological tools can make thinking visible to model the process of knowledge integration, illustrate complex ideas, and motivate critical analysis of complex situations. Learning environments, such as WorldWatcher (http://www.worldwatcher.nwu.edu/), Scientists in Action (http://peabody.vanderbilt.edu), and the Web-Based Integrated Science Environment (WISE, http://wise.berkeley.edu), guide students in complex inquiry and make thinking visible with scientific visualizations. In these projects, students successfully debate the causes of frog deformities, evaluate the water quality in local streams, and design houses for desert climates. Research demonstrates benefits of asking students to make their thinking visible in predictions, reflections, assessments of their progress, and collaborative debate. Technological learning environments also make student ideas visible with embedded performance assessments that capture the ability to critique arguments, make predictions, and reach conclusions. Such assessments help teachers and researchers identify how best to improve innovations and provide an alternative to high-stakes assessments (Heubert and Hauser 1998, Bransford et al. 1999).

2.3 Helping Students Learn from Others

When students collaboratively investigate science problems, they can prompt each other to reflect, provide explanations in the language of their peers, negotiate norms for critiques of arguments, and specialize in specific aspects of the problem. Several programs, including Kids as Global Scientists (http://www.onesky.umich.edu/) and Project Globe (http://www.globe.gov/), orchestrate contributions from students around the world to track and compare extreme weather. Technological learning environments support global collaborations as well as collaborative debate and equitable discussion. In collaborative debate (see Science Controversies On-Line: Partnerships in Education, SCOPE, http://scope.educ.washington.edu/), students research contemporary controversies on the Internet with guidance from a learning environment, prepare their arguments, often using visual argument representations, and participate in a classroom debate where every student composes a question for each presenter. Teachers have a rich sample of student work to use for assessment. Online scientific discussions engage many more students than do class discussions and also elicit more thoughtful contributions (Linn and Hsi 2000).

2.4 Promoting Autonomy and Lifelong Learning

New pedagogical practices, often implemented in computer learning environments, can nudge students towards deliberate, self-guided learning. Projects with personally relevant themes capitalize on the intentions of students and motivate students to revisit ideas after science class is over. For example, students who studied deformed frogs brought news articles to their teacher years after the unit was completed. Projects can offer students a rich context, such as the rescue of an endangered animal or the analysis of the earthquake safety of their school, that raises investment in science learning.
Students need to monitor their own learning to deal with new sources of science information, such as the Internet, where persuasive messages regularly appear along with public service announcements. Helping students jointly form partnerships where multiple forms of expertise are represented, develop common norms and criteria for evaluating arguments, and deliberately review their progress prepares them for situations likely to occur later in life.
3. Research Methods

Researchers have responded to the complexities in science education with new research methods informed by practices in other design sciences, including medicine and engineering. In the design sciences, researchers create innovations like science curricula, drugs, or machines and study these innovations in complex settings. In education, these innovations, such as technology-enhanced science projects, build on increased understanding of science learning. Research in a design science is typically directed by a multi-disciplinary partnership. In education, partners bring expertise in a broad range of aspects of learning and instruction, including technology, pedagogy, the science disciplines, professional development, classroom activity structures, and educational policy; collaborators often have to overcome perceptions of status differences among the fields represented. By working in partnership, individuals with diverse forms of expertise can jointly contribute to each other's professional development.

Practices for design studies come from design experiments and Japanese lesson study (Brown 1992, diSessa 2000, Lewis 1995). Design studies typically start when the partnership creates an innovation, such as a new learning environment, curriculum, or assessment, and have as their goal the continuous improvement of the innovation. The inspiration for innovative designs can come from laboratory investigations, prior successes, spontaneous ideas, or new technologies. Many technology-enhanced innovations have incorporated elements that have been successful in laboratory studies, like one-on-one tutoring or collaborative learning. Other innovations start with scientific technologies such as real-time data collection (Bransford et al. 1999). In design studies, partners co-design innovations and evaluations following the same philosophy so that assessments are sensitive to the goals of instruction. Partners often complain that standardized, multiple-choice tests fail to tap progress in knowledge integration and complex scientific understanding. Results from assessment allow the partnership to engage in principled refinement of science instruction and can also inform future designers. Often, innovations become more flexible and adaptive as they are refined,
making it easier for new teachers to tailor instruction to their students and curricular goals. Design studies may also be conducted by small groups of teachers engaging in continuous improvement of their instruction, inspired by the Japanese lesson study model. When teachers collaborate to observe each other, provide feedback, and jointly improve their practice, they develop group norms for design reviews. Methodologies for testing innovations in complex settings and interpreting results draw on an eclectic mix of approaches from a broad range of fields: classroom observations, video case studies, embedded assessments of student learning, student performance on design reviews, classroom tests, and standardized assessments, as well as longitudinal studies of students and teachers. When partnerships design rubrics for interpreting student work, they develop a common perspective.

Design study research has just begun to address the complexities of science education. For example, current investigations reveal wide variation among teachers implementing science projects. In classrooms, some teachers spend up to five minutes with each group of students, while others spend less than one minute per group. In addition, some teachers, after speaking to a few small groups, recognize a common issue and communicate it to their whole class. This practice, while demanding for teachers, has substantial benefits. Developing sensitivity to student dilemmas when complex projects are underway requires the same process of knowledge integration described above and may only emerge in the second and subsequent uses of the project. The design study approach to science instruction succeeds when teachers and schools commit to multiple refinements of the same instructional activities, rather than reassigning teachers and selecting new curriculum materials annually.
4. Emerging Research Topics and Next Steps

Emerging areas for science education research include professional development and policy studies. The interpretive, cultural, and deliberate character of learning applies equally to these fields. Teachers may value lifelong science learning but have little experience connecting the science in the curriculum to their own spontaneous ideas and few insights into how to support this process in students. Teachers taking a knowledge integration approach to instruction face complex questions such as whether to introduce genetics by emphasizing genotypes, phenotypes, the human genome, or a treatment perspective. They may discover that some students come to class believing that two unrelated people who look alike could be twins. Effective professional development should help teachers with these sorts of specific questions rather than providing general
grounding in the science discipline or glitzy experiments that confuse rather than inform students. Inspired by the Japanese lesson study approach, more and more research groups are convening and studying collaborative groups of science teachers who face similar instructional decisions. Like science students, science teachers need opportunities and encouragement to build on their spontaneous ideas about learning, instruction, and the nature of science to become proficient in guiding their own professional development.

Science policy-makers may also be viewed through this lens of interpretive, cultural, and deliberate science learning. Policy-makers frequently hold ideas about science learning that might be at odds with those held by teachers. The status differences between policy-makers, natural scientists, and classroom teachers can interfere with open and effective communication. Building a cohesive perspective on pedagogy, science teaching, and science learning has proven very difficult in almost every country. The popularity of high-stakes assessments that might be insensitive to innovation underscores the dilemma. Recent news reports that high-stakes assessments are motivating absenteeism, cheating, and unpromising teaching practices increase the problem (Heubert and Hauser 1998).

Science educators face many complex, pressing problems, including the connection between science and technology literacy; the dual goals of excellence and equitable access to science careers; the tradeoffs between a focus on public understanding of science and career preparation; the role and interpretation of high-stakes tests; and the challenges of balancing the number of topics in the curriculum with the advantages of science project work. If we form a global partnership for science education and jointly develop a cohesive research program on lifelong learning, we have an unprecedented opportunity to collaboratively design and continuously improve teaching, instruction, and learning.

See also: Discovery Learning, Cognitive Psychology of; Gender and School Learning: Mathematics and Science; Scientific Concepts: Development in Children; Scientific Reasoning and Discovery, Cognitive Psychology of; Teaching and Learning in the Classroom; Teaching for Thinking
Bibliography

Bransford J, Brown A L, Cocking R R, National Research Council (US) 1999 How People Learn: Brain, Mind, Experience, and School. National Academy Press, Washington, DC
Brown A 1992 Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences 2(2): 141–78
Bruer J T 1993 Schools for Thought: A Science of Learning in the Classroom. MIT Press, Cambridge, MA
Clement J 1991 Non-formal Reasoning in Science: The Use of Analogies, Extreme Cases, and Physical Intuition. Lawrence Erlbaum Associates, Mahwah, NJ
diSessa A 2000 Changing Minds: Computers, Learning, and Literacy. MIT Press, Cambridge, MA
Driver R, Leach J, Millar R, Scott P 1996 Young People's Images of Science. Open University Press, Buckingham, UK
Feurzeig W, Roberts N (eds.) 1999 Modeling and Simulation in Science and Mathematics Education. Springer, New York
Heubert J, Hauser R (eds.) 1998 High Stakes: Testing for Tracking, Promotion, and Graduation. National Academy Press, Washington, DC
Keller E F 1983 Gender and Science. W H Freeman, San Francisco, CA
Lewis C 1995 Educating Hearts and Minds: Reflections on Japanese Preschool and Elementary Education. Cambridge University Press, New York
Linn M C, Hsi S 2000 Computers, Teachers, Peers: Science Learning Partners. Lawrence Erlbaum Associates, Mahwah, NJ
Pfundt H, Duit R 1991 Students' Alternative Frameworks, 3rd edn. Institute for Science Education at the University of Kiel/Institut für die Pädagogik der Naturwissenschaften, Kiel, Germany
Piaget J 1971 Structuralism. Harper & Row, New York
Schmidt W H, Raizen S A, Britton E D, Bianchi L J, Wolfe R G 1997 Many Visions, Many Aims: A Cross-national Investigation of Curricular Intentions in School Science. Kluwer, Dordrecht, The Netherlands
Vygotsky L S 1962 Thought and Language. MIT Press, Cambridge, MA
Wellesley College Center for Research on Women 1992 How Schools Shortchange Girls. American Association of University Women Education Foundation, Washington, DC
White B Y, Frederiksen J R 1998 Inquiry, modeling, and metacognition: Making science accessible to all students. Cognition and Instruction 16(1): 3–118
M. C. Linn
Science Funding: Asia

Science funding refers to national expenditure, from both public and private sources, for the institutionalization and promotion of a variety of scientific activities conventionally termed research and development (R&D). These may take the form of basic, applied, and development research undertaken or sponsored across a range of science and technology (S&T) institutions in national S&T innovation systems. In the postwar era, the concept of science funding assumed considerable importance in the national policies of Asian countries. The government, both as a patron and as a powerful mediator, played a significant part in shaping the structure and direction of science funding. In the latter half of the twentieth century, the belief in science as a symbol of 'progress' was transformed into an established policy doctrine in the Asian region.
Creating wealth from knowledge and achieving social, political, and military objectives came to be closely associated with deliberately fostering S&T activities through funding scientific research. In the mid 1990s, the Asian region accounted for 26.9 percent of the world's gross expenditure on research and development (GERD), which was US$470 billion in 1994. While Japan and newly industrializing countries (NICs) accounted for 18.6 percent, South East Asia, China, India, and other South Asian countries accounted for 0.9 percent, 4.9 percent, 2.2 percent, and 0.3 percent respectively. Within the Asian region, whereas Japan and NICs accounted for about 69 percent, other countries accounted for 31 percent of total science funding.

Three main interrelated approaches underlie the concept of science funding in Asian countries. The first approach underlines the importance and essential nature of public or government funding of R&D as a 'public good' that serves the general interest and welfare of the society or nation as a whole. The public good approach emphasizes the funding and generation of knowledge that is non-competitive, open for public access, and non-exclusive. As private firms and the market are unable to capture all of the benefits of scientific research, they often tend to under-invest in R&D, which is also a key component of the process of innovation. It is in this context that public funding becomes important by bearing the social costs of generating scientific knowledge or information. Further, numerous studies have shown that public funding of scientific research as a public good yields several social and economic benefits (see Arrow 1962, Pavitt 1991). In contrast to institutionalized and 'codified' forms of knowledge, scientific capacity in the form of 'tacit knowledge' is seen to be embodied in research personnel trained over a period of time. Thus, publicly supported training in higher educational settings and networks of professional associations and academies forms an important component of science funding.

A closely related second approach to science funding is state sponsorship of military and defense-related strategic R&D. Even though there is considerable spinoff from military R&D to the non-military sectors of the economy, the main rationale for the importance given to such research is national security. Most countries in the world, including those in the Asian region, spend more money on military and defense-related scientific research activities than in the civilian R&D domain.

The third approach essentially recognizes the importance of private sources of funding for scientific research. Although private patrons have always played an important part in supporting S&T-related research activities, the postwar decades, particularly the 1980s and 1990s, witnessed a remarkable shift from public to private sources of funding of science.
Table 1
GERD by source of funding from the late 1970s (A) to the mid-1990s (B)

            Govt        Private industry    Other nat. sources
            A     B     A     B             A     B
Japan       30    22    70    67            —     10
NICs        43    33    52    63            3     4
SE Asia     86    58    14    27            —     15
India       89    82    12    16            —     1
China       100   82    —     15            —     3
OSA         85    78    6     8             7     14

Source: World Science Report, Unesco, Paris, 1998; CAST ASIA II, 22–30 March 1982. OSA: Other South Asian Countries include Bangladesh, Nepal, Myanmar, Pakistan, Sri Lanka, and Mongolia. NICs include South Korea, Taiwan, Singapore, Hong Kong. South East Asia includes Thailand, Philippines, Malaysia, Indonesia, and Vietnam.
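The share figures quoted at the opening of this article can be made concrete with a little percentage arithmetic. The following sketch (an illustrative addition, not part of the original article; all figures are taken from the text above) converts the 1994 regional shares of world GERD into approximate dollar amounts:

```python
# Convert the regional shares of world GERD quoted in the text into
# absolute amounts, assuming world GERD of US$470 billion in 1994.
WORLD_GERD_BN = 470.0  # US$ billion, 1994

shares_pct = {
    "Asian region (total)":        26.9,
    "Japan and NICs":              18.6,
    "South East Asia":              0.9,
    "China":                        4.9,
    "India":                        2.2,
    "Other South Asian countries":  0.3,
}

for region, pct in shares_pct.items():
    amount = WORLD_GERD_BN * pct / 100
    print(f"{region}: {pct}% of world GERD = US${amount:.1f} billion")

# Japan and the NICs as a share of the Asian regional total:
print(f"Japan and NICs within Asia: {18.6 / 26.9:.0%}")  # ~69%, as stated in the text
```

On these figures, the Asian region as a whole accounts for roughly US$126 billion, of which Japan and the NICs contribute about US$87 billion, consistent with the statement that they made up about 69 percent of the regional total.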
The significance of private science funding has grown since the 1980s, as the 'linear model,' which was based on the perceived primacy of basic research, began to lose its credibility. Insights from studies on the economics of innovation came to substantiate the view that it is 'downstream' development spending in the spectrum of R&D 'that plays a crucial role in determining who gets to capture the potential rents generated by scientific research' (Rosenberg 1991, p. 345). Private industrial firms, rather than government funding, dominate this segment of R&D in Asia as in other regions. Second, the importance of private industry as a source of science funding attracted considerable attention from the mid-1980s, as the success of Japan and other East Asian NICs came to be analyzed from the perspective of how these countries transferred the burden of science funding from public to private sources.

As the expenditure on military R&D is largely met by governments, the pattern of GERD in the Asian region can be explored in terms of public and private sources. As Table 1 shows, three broad subregions are discernible in the Asian region insofar as the source of science funding is concerned. While private industry accounts for approximately two-thirds, government accounts for one-third of the total R&D expenditure in the case of Japan and the NICs. South Korea has undergone a particularly significant shift towards private funding of science: private sources accounted for about 2.3 percent of GDP, that is, 80 percent of total R&D funds, one of the highest levels in the world. In Japan, Singapore, and Taiwan, over 65 percent of total R&D spending is financed by private industry. Much of this transformation is the result of state mediation and institutional mechanisms which offer appropriate incentives for the private sector to invest in R&D.
Table 2
GERD and military expenditure as a percentage of GDP

                         GERD    Military expenditure
                         1994    1985    1995
Japan                    3.02    1.00    1.00
NICs                     1.60    5.45    4.05
SE Asia                  0.50    2.95    2.20
India                    0.80    3.50    2.40
China                    0.50    4.90    2.30
Other Asian countries    0.38    4.60    3.50

Source: World Science Report, Unesco, Paris, 1998; World Development Reports 1998/1999 and 1999/2000.
In the second subregion of South East Asia, there has been a perceptible change in private industry's share of total R&D, which increased from 14 percent to 27 percent, whereas the government's share witnessed a decline from 86 percent to 58 percent between the late 1970s and the mid-1990s. In Southern Asia, which comprises China, India, and other South Asian countries, the government continues to be the main patron for science funding. The private sector accounts for less than 16.5 percent. In China and India, while the proportion of private funding of R&D increased to 15 percent and 4.4 percent respectively between the late 1970s and the mid-1990s, the private sector is likely to play a more dominant role in the 2000s. This is because these countries are witnessing considerable foreign direct investment and economic restructuring, which are fostering liberalization and privatization. Except in the case of the Philippines and Sri Lanka, where about a quarter of total R&D funding is met by foreign sources, foreign funding hardly plays a significant role in the Asian region as a whole.

Although there is no established correlation between GERD as a percentage of GDP and economic growth rates, this indicator has assumed great significance in S&T policy discourse in the 1990s. Whereas the industrially advanced countries spend between 2 percent and 3 percent of GDP on R&D activities, middle-income countries such as the NICs and some others in the South East Asian region spend over 1.5 percent. The poorer developing countries spend less than 1 percent, and the countries at the bottom of the economic and human development rankings spend even less than 0.5 percent. This general trend is
also largely borne out in Asia, as shown in Table 2. However, the relation between different types or forms of R&D and economic growth has also come into sharper focus. Even though the share of private industry in overall R&D effort is emerging as one of the determinants of economic dynamism, some studies draw attention to various national technological activities (in relation to the national science base) which have a direct bearing on national economic performance measured in terms of productivity and export growth (see Pavitt 1998, p. 800). For instance, as shown in Table 3, the most economically dynamic NICs such as Taiwan and South Korea began with relatively weak scientific strength, as did India in the early 1980s. Nevertheless, they outperformed India by the early 1990s in terms of the change in their share of world publications, as well as in the share of registered patents in the USA. Taiwan and South Korea not only spent a much higher proportion on R&D as a percentage of GDP, but the privately dominated R&D structure was such that it gave high priority to patenting. In actual terms, while Japan and South Korea filed 335,061 and 68,446 patents in their respective countries, India filed only 1,545 patents in 1995–6. Further, whereas high technology exports as a proportion of total manufacturing exports in Japan, South Korea, Hong Kong, and Singapore registered 39 percent, 39 percent, 27 percent, and 71 percent respectively, India registered hardly 10 percent in 1996, which is much lower than the figure for China (21 percent) for the same year.

As indicated in Table 2, most of the Asian countries carry a substantial military burden. Even though there was a drastic reduction in military expenditure as a percentage of GDP between 1985 and 1995 in the region, all the Asian countries and subregions still spend three to four times more on military activities than on civilian R&D. While Pakistan maintained over 6 percent, Sri Lanka increased its military expenditure as a percentage of GDP from 2.9 percent to 4.6 percent between 1985 and 1995. Japan is the only country in the Asian region that limited its military expenditure to 1 percent of GDP while spending over three times this figure on civilian R&D (3.1 percent) during the same period.
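The claim that Asian countries spend three to four times more on military activities than on civilian R&D can be checked directly against Table 2. The sketch below (again an illustrative addition, using only the values reported in the table) computes the ratio of 1995 military expenditure to 1994 GERD for each country or subregion:

```python
# Ratio of military expenditure (1995) to civilian R&D (GERD, 1994),
# both expressed as percentages of GDP, using the values in Table 2.
table2 = {
    # region: (GERD 1994, military 1985, military 1995)
    "Japan":                 (3.02, 1.00, 1.00),
    "NICs":                  (1.60, 5.45, 4.05),
    "SE Asia":               (0.50, 2.95, 2.20),
    "India":                 (0.80, 3.50, 2.40),
    "China":                 (0.50, 4.90, 2.30),
    "Other Asian countries": (0.38, 4.60, 3.50),
}

for region, (gerd, mil_1985, mil_1995) in table2.items():
    ratio = mil_1995 / gerd
    print(f"{region}: military (1995) / GERD (1994) = {ratio:.1f}")
```

The computed ratios run from about 2.5 for the NICs to over 9 for the other Asian countries, with Japan, at roughly 0.3, the exception singled out in the text.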
Table 3
Trends in scientific and technological performance in selected Asian countries

Country       Change in share of     Change in share of    Publications per
              world publications,    US patents,           million population,
              1993/1982              1993/1983             1980–84
Taiwan        5.97                   12.81                 23.3
South Korea   5.45                   29.79                 8.00
Singapore     3.53                   3.20                  71.6
Hong Kong     2.37                   2.42                  45.90
India         0.83                   2.45                  18.10

Source: As given in Pavitt (1998, p. 800).
13675
In terms of the sectoral focus of R&D funding in the Asian region, a contrasting picture emerged. Japan and the NICs directed their science funding to high technologies, capital, exports of engineering goods, advanced materials, modern biology, and basic research in information and communication technologies. In the 1990s, Japan, South Korea, and Singapore registered greater proportions of private sector funding than even Germany and the USA. Japan's R&D expenditures increased eightfold over the period from 1971 to 1993, which is the highest rate of increase among the industrially developed nations. At the other extreme are the Southern Asian countries, including China, India, Pakistan, Bangladesh, Nepal, Myanmar, and Sri Lanka, where R&D funding related to agriculture and to the manufacturing sector assumed equal importance, as over 60 percent of their populations were dependent on agriculture. Another revealing feature about countries such as China, India, and Pakistan was the importance given to military and strategic R&D, which consumed 45 to 55 percent of the total R&D budget. In most of the South East Asian countries, the contribution of agriculture to the GDP witnessed a considerable decline in contrast to Southern Asian countries. In terms of sectoral contribution to the GDP, none of the South East Asian countries accounted for more than 26 percent (the figure for Vietnam) for agriculture in 1998. The manufacturing and service sectors witnessed unprecedented growth rates between 1980 and 1998. Though these countries spend no more than 0.5 percent of GDP on R&D, the thrust of science funding is directed to manufacturing, industrial, and service-related activities. Agricultural research consumes only 5–10 percent of the total R&D funding, much less than in the Southern Asian countries.

See also: Infrastructure: Social/Behavioral Research (Japan and Korea); Research and Development in Organizations; Research Ethics: Research; Research Funding: Ethical Aspects; Science, Economics of; Science Funding: United States; Science, Social Organization of; Scientific Academies in Asia; Universities and Science and Technology: Europe; Universities and Science and Technology: United States
Bibliography

Arrow K 1962 Economic welfare and the allocation of resources for invention. In: Nelson R (ed.) The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ, pp. 609–25
Pavitt K 1991 What makes basic research economically useful? Research Policy 20: 109–19
Pavitt K 1998 The social shaping of the national science base. Research Policy 27: 793–805
Rosenberg N 1991 Critical issues in science policy research. Science and Public Policy 18: 335–46
UNESCO 1998 World Science Report. Paris
World Development Report 1998/1999
World Development Report 1999/2000
V. V. Krishna
Science Funding: Europe

European governments invest considerable sums of money in science. This article examines the reasons why they do this, covering briefly the historical context of European science funding and highlighting current issues of concern. The focus is on government funding of science, rather than funding by industry or charities, since government has historically been the largest funder of 'science' as opposed to 'technology.' As an approximate starting point, 'science' refers to research that is undertaken to extend and deepen knowledge rather than to produce specific technological results, although the usefulness of this distinction will be questioned below. 'Science policy' here means the set of objectives, institutions, and mechanisms for allocating funds to scientific research and for using the results of science for general social and political objectives (Salomon 1977). 'Europe' here refers only to Western Europe within the European Union, excluding the Eastern European countries.
1. Background

Government funding of science in Europe started in a form that would be recognizable today only after World War II, although relations between science and the state can be traced back at least as far as the Scientific Revolution in the seventeenth century (Elzinga and Jamison 1995). The history of science funding in Europe can be summarized broadly as a movement from relative autonomy for scientists in the postwar period, through stages of increasing pressures for accountability and relevance, resulting in the situation today, where the majority of scientists are encouraged to direct their research towards areas that will have some socially or industrially relevant outcome. However, this account is too simplistic. Many current concerns about science funding are based on the idea that 'pure' or 'basic' science (autonomous research concerned with questions internal to the discipline) is being sacrificed in favor of 'applied' research (directed research concerned with a practical outcome), incorrectly assuming that there is an unproblematic distinction between the two (see Stokes 1997). Looking back, it can be seen that even in the late 1950s there were expectations that science should provide practical outcomes in terms of economic and social benefits, and the work that scientists were doing
at this time was not completely 'pure,' because much of it was driven by Cold War objectives. This is a demonstration of the broader point that in science policy, the categories used to describe different types of research are problematic, and one must be careful when using the traditional terminology. With these caveats in place, it is possible to trace the major influences on European science funding. In the 1950s and 1960s, much of the technologically oriented funding of research was driven by military objectives and attempts to develop nuclear energy. In terms of science funding, this was a period of institutional development and expansion in science policy (Salomon 1977). The autonomy that scientists enjoyed at this time was based on the assumption that good science would spontaneously generate benefits. Polanyi (1962) laid out the classic argument to support this position, describing a self-governing 'Republic of Science.' He argued that because of the essential unpredictability of scientific research, government attempts to direct science would be counterproductive because they would suppress the benefits that might otherwise arise from undirected research. This influential piece can be seen as a response to Bernal's work (1939), which was partly influenced by the Soviet system and which argued that science should be centrally planned for the social good. Another important concept of the time was the 'linear model' propounded by US science adviser Vannevar Bush (1945). In this model for justifying the funding of science, a one-way conceptual line was drawn leading from basic research to applied research to technological innovation, implying that the funding of basic research would ultimately result in benefits that would be useful to society. But pressures on science from the rest of society were increasing. In the 1970s, there was a growing awareness of environmental problems (often themselves the results of scientific and technological developments), and European countries experienced the oil crises, with accompanying fiscal restrictions. There were increasing pressures on scientists to be accountable for the money they were spending on research. Also at this time the social sciences, especially economics, provided new methods for understanding the role of scientific research in industrial innovation and economic growth (see Freeman 1974). In the 1980s Europe realized it had to respond to the technological and economic challenges of Japan and the US, and because of the ending of the Cold War, military incentives for funding science were no longer so pressing. Technology, industrial innovation, and competitiveness were now the main reasons for governments to fund science. Academic studies of innovation also began to question the linear model of the relationship between science and technology, described above, arguing that the process was actually more complicated (e.g., Mowery and Rosenberg 1989). This led to pressures on the previous 'contract'
between government and scientists (Guston and Keniston 1994), which had been based on the assumptions of the linear model. Rather than presuming that science would provide unspecified benefits at some unspecified future time, governments now held greater and more specific expectations of scientists in return for public funding. This was accompanied by reductions in the growth of science budgets, producing a 'steady state' climate for scientific research, where funding was not keeping up with the rapid pace at which research was growing (see Ziman 1994). Science policy work at this time produced tools and data for measuring and assessing science. Various techniques were developed, such as technology assessment, research evaluation, technology management, indicator-based analysis, and foresight (Irvine and Martin 1984). In the 1990s, there was greater recognition of the importance of scientific research for innovation, with the development of new hi-tech industries that relied on fundamental scientific developments (such as biotechnology), in conjunction with other advanced technologies. There were also growing pressures for research to be relevant to social needs. Gibbons et al. (1994) argued that the 1990s had witnessed an increasing emphasis on problem-oriented, multidisciplinary research, with knowledge production having spread out to many diverse locations, and that distinctions between basic and applied science, and between science and technology, are becoming much more difficult to make.
2. The Influence of the European Union

Moving from a general historical context to look more specifically at the European level shows that research funding from the European Union (EU), in the form that it currently takes, did not start until 1984 with the first 'Framework Programme.' This funded pre-competitive research (i.e., research that is still some way from market commercialization) following an agenda influenced by industrial needs (Sharp 1997). From the 1960s onward, the Organization for Economic Cooperation and Development (OECD) had been a more influential multinational organization than the EU in terms of national science policies (Salomon 1977). In particular, the OECD enabled countries to compare their research activities with those of other countries, and encouraged greater uniformity across nations. Currently, EU research funding comprises only a few percent of the total research funding of all the member states (European Commission 1994), although it has been more important in the 'less favored' regions of Europe (Peterson and Sharp 1998). Consequently, in terms of science funding, the national sources are more important than the EU. However, EU programs do have an influence on the funding priorities of national governments. In theory, the EU
does not fund research that is better funded by nation states (according to the 'principle of subsidiarity,' see Sharp 1997), so it does not fund much basic research, but is primarily involved in funding research that is directed towards social or industrial needs. The most important impact of the EU has been in stimulating international collaboration and helping to form new networks, encouraging the spread of skills. One of the requirements of EU-funded projects is that they must involve researchers from at least two countries (Sharp 1997). This could be seen as part of a wider political project that is helping to bind Europe together. It is possible that many of these collaborations might have happened without European encouragement because of a steady rise in all international collaborations (Narin et al. 1991). However, it is likely that through its collaborative programs and their influence, the EU will play an increasingly important role in the future of research funding in the member countries (Senker 1999).
3. Individual Countries in Europe

Since it is the individual countries in Europe that are responsible for the majority of science funding, the organization of their research systems deserves attention. All the countries have shown the general trends outlined above, but the historical and cultural differences among the European nations lead to considerable diversity in science funding arrangements. It is possible to compare the different countries by looking at the reasons why they fund science and the ways in which research is organized. European nations, like those elsewhere, have traditionally funded science to encourage economic development, although most countries also attach importance to advancing knowledge for its own sake. Some countries, such as Sweden and Germany, have emphasized the advancement of knowledge, and other countries, such as Ireland, have put more emphasis on economic development (Senker 1999). Since the 1980s, the economically important role of science has been emphasized in every country. This has often been reflected at an organizational level with the integration of ministerial responsibilities for science funding with those for technology and higher education. We can compare individual countries in terms of differences in the motivations behind funding research. Governments in France and Italy have traditionally promoted 'prestige' research, and have funded large technology projects, such as nuclear energy. These reasons for funding research, even though they are less significant in the present climate, have had long-lasting effects on the organization of the national research systems. The UK is notable in that the importance of science for economic competitiveness is
emphasized more than in other European countries, and industrial concerns have played a larger role (Rip 1996). Organizational differences between countries can tell us something about the way research funding is conceptualized and can reflect national attitudes toward the autonomy and accountability of researchers. In the different European countries, the locus of scientific research varies. In some countries, the universities are most important (e.g., Scandinavia, Netherlands, UK), and researchers compete for funds from research councils (institutions that mediate between scientists and the state, see Rip 1996). In this type of system there will usually be some additional university funding that provides the infrastructure and some of the salaries. The level of this funding varies between countries, which results in differences in scientists' dependence on securing research council funds and has implications for researcher autonomy. In other countries, a great deal of scientific research is carried out in institutions that are separate from the universities (e.g., France and Italy). The situation is not static, and scientific research in the university sector has been growing in importance across the whole of Europe (Senker 1999). For example, in France, the science funding system has traditionally been centralized, with most research carried out in the laboratories of the Centre National de la Recherche Scientifique (CNRS). Now the situation is changing and universities are becoming more involved in the running of CNRS labs, because universities are perceived to be more flexible and responsive to user needs (Senker 1999). Germany is an interesting case because there is a diversity of institutions involved in the funding of science. There is a division of responsibility between the federal state and the Länder, which are responsible for the universities. There are also several other types of research institute, including the Max Planck institutes, which do basic research, and the more technologically-oriented Fraunhofer institutes. Resulting institutional distinctions between different types of research may lead to rigidities in the system (Rip 1996). In all countries in Europe, there is an attempt to increase coordination between different parts of the national research system (Senker 1999).
4. Current Trends

As has been emphasized throughout, European governments have demanded increasing relevance of scientific results and accountability from scientists in return for funding research. Although the situation is complex, it is clear that these pressures, and especially the rhetoric surrounding them, increased significantly during the 1990s. This has led to worries about the place for serendipitous research in a 'utilitarian–instrumental' climate (Nowotny 1997, p. 87).
These pressures on science to be useful are not the only notable feature of the current funding situation. The views of the public are also becoming more important in decisions concerning the funding of science. The risks and detrimental effects of science are of particular concern to the public, possibly because the legitimacy of the authority of politicians and scientists is being gradually eroded (Irwin and Wynne 1996). Throughout Europe, there has been a growth in public distrust in technological developments, which has led to pressures for wider participation in the scientific process. This is related to the current (and somewhat desperate) emphasis on the 'public understanding of science,' which is no longer simply about educating the public in scientific matters, but has moved towards increasing participation in the scientific process (see Gregory and Miller 1998). Concerns about the environmental effects of scientific developments can be traced back to the 1960s, but recent incidents in the 1980s and 1990s have led to a more radical diminution of public faith in scientific experts (with issues such as climate change, Chernobyl, and genetically modified foods). The public distrust of science may also be due to the fact that scientists, by linking their work more closely either to industrial needs or to priorities set by government, are losing their previously autonomous and potentially critical vantage point in relation to both industry and government. Certain European countries, especially the Netherlands and Scandinavia, which have a tradition of public participation, are involving the public more in debates and priority setting on scientific and technological issues. This has been described as a 'postmodern' research system (Rip 1996). As the distinction between science and technology becomes less clear, in this type of research system there is also a blurring of boundaries between science and society.
5. Implications

An implication of these changes in science funding is that the growing importance of accountability and of the role of the public in scientific decisions may have epistemological effects on the science itself, since scientific standards will be determined not only by the scientific community but by a wider body of actors, often with divergent interests (Funtowicz and Ravetz 1993). If norms are linked to institutions, and if institutions are changing because of the greater involvement of external actors in science, and of science in other arenas, then the norms may be changing too (Elzinga 1997). This is an issue that was touched on in the 1970s and 1980s by the proponents of the 'finalization thesis,' who argued that, as scientific disciplines become more mature, they become more amenable to external
steering (Böhme et al. 1983). The importance of external influences leads to worries about threats to traditional values of what constitutes 'good' science (Elzinga 1997, see also Guston and Keniston 1994 for US parallels). New standards for evaluating scientific research may emerge. European science funding has changed considerably since it was institutionalized, partly because of its success in generating new technologies and partly because of its failures and their social consequences. It is becoming more difficult to categorize science, technology, and society as separate entities (Jasanoff et al. 1995), or to think of pure scientists as different from those doing applied research. Wider society has become inextricably linked with the progress of science, and the demands placed on scientists and science-funding mechanisms are starting to reflect this restructuring. This tendency is likely to continue into the future.

See also: Infrastructure: Social/Behavioral Research (Western Europe); Kuhn, Thomas S (1922–96); Research and Development in Organizations; Research Funding: Ethical Aspects; Science, Economics of; Science Funding: Asia; Science Funding: United States; Science, Social Organization of; Universities and Science and Technology: Europe
Bibliography

Bernal J D 1939 The Social Function of Science. Routledge, London
Böhme G, Van den Daele W, Hohlfeld R, Krohn W, Schäfer W 1983 Finalization in Science: The Social Orientation of Scientific Progress. Reidel, Dordrecht, The Netherlands
Bush V 1945 Science: The Endless Frontier. USGPO, Washington, DC
Elzinga A 1997 The science–society contract in historical transformation. Social Science Information 36: 411–45
Elzinga A, Jamison A 1995 Changing policy agendas. In: Jasanoff S, Markle G E, Petersen J, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
European Commission 1994 The European Report on Science and Technology Indicators. European Commission, Luxembourg
Freeman C 1974 The Economics of Industrial Innovation. Penguin, Harmondsworth, UK
Funtowicz S O, Ravetz J R 1993 Science for the post-normal age. Futures 25: 739–56
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge. Sage, London
Gregory J, Miller S 1998 Science in Public: Communication, Culture and Credibility. Plenum, New York
Guston D, Keniston K 1994 The Fragile Contract: University Science and the Federal Government. MIT Press, London
Irvine J, Martin B 1984 Foresight in Science: Picking the Winners. Pinter, London
Irwin A, Wynne B (eds.) 1996 Misunderstanding Science? The Public Reconstruction of Science and Technology. Cambridge University Press, Cambridge, UK
Jasanoff S, Markle G E, Petersen J, Pinch T (eds.) 1995 Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Mowery D, Rosenberg N 1989 Technology and the Pursuit of Economic Growth. Cambridge University Press, Cambridge, UK
Narin F, Stevens K, Whitlow E 1991 Scientific co-operation in Europe and the citation of multinationally authored papers. Scientometrics 21: 313–23
Nowotny H 1997 New societal demands. In: Barre R, Gibbons M, Maddox J, Martin B, Papon P (eds.) Science in Tomorrow's Europe. Economica International, Paris
Peterson J, Sharp M 1998 Technology Policy in the European Union. Macmillan, Basingstoke, UK
Polanyi M 1962 The republic of science: its political and economic theory. Minerva 1: 54–73
Rip A 1996 The post-modern research system. Science and Public Policy 23: 343–52
Salomon J J 1977 Science policy studies and the development of science policy. In: Spiegel-Rösing I, de Solla Price D (eds.) Science, Technology and Society: A Cross-disciplinary Perspective. Sage, London
Senker J 1999 European Comparison of Public Research Systems. Report prepared for the European Commission. SPRU, Sussex, UK
Sharp M 1997 Towards a federal system of science in Europe. In: Barre R, Gibbons M, Maddox J, Martin B, Papon P (eds.) Science in Tomorrow's Europe. Economica International, Paris
Stokes D E 1997 Pasteur's Quadrant: Basic Science and Technological Innovation. Brookings Institution Press, Washington, DC
Ziman J 1994 Prometheus Bound. Cambridge University Press, Cambridge, UK
J. Calvert and B. R. Martin
Science Funding: United States

The pursuit of knowledge has historically been shaped by patronage relationships. The earliest scientists were astronomers and mathematicians supported at ancient courts in the Middle East and China. In Renaissance Europe, observers of and experimenters with nature had patrons in the aristocracy or royal families. In nineteenth-century America, industrial philanthropists took up their cause. Since World War II, research in the United States has received its support largely from government, with help from industry and private foundations. This article focuses on the postwar history of government funding for research in the United States. That history has been characterized by a creative tension between autonomy and accountability, embodied in successive waves of invention of new institutional arrangements. Should science set its own agenda, or should it answer to the public? Could the two goals be reconciled? These issues have echoed through five decades of research policy.
1. Autonomy for Prosperity

In the eighteenth and nineteenth centuries, the US federal government funded research only if it was applied directly to practical goals. The earliest federal efforts were in surveying and geological exploration, activities that contributed both to nation-building and to the search for mineral wealth. A second major wave of federal effort in the latter half of the nineteenth century took up agricultural research in partnership with the States, through land-grant colleges and agricultural experiment stations. Scientists were also brought in to solve the immediate problems of wartime. World War I was known as 'the chemists' war,' because of the use of chemical warfare agents. World War II became 'the physicists' war,' with the invention of the atomic bomb. And in the 1930s, a small federal laboratory for health research was established, which later became the National Institutes of Health, the nation's research arm for biomedical sciences (Dupree 1964, Kevles 1978, Strickland 1989). By the time of World War II, these efforts had grown into a set of research programs in mission agencies, focused on food, health, and defense. These activities were carried out largely in government laboratories, and were focused on immediate, practical goals. In later decades, two other agencies joined this group. The National Aeronautics and Space Administration (NASA) was formed in response to the launching of the Russian satellite Sputnik in 1957 (Logsdon 2000). And the Department of Energy, when it was established in the 1970s, incorporated major research elements, including the high-energy physics laboratories. This array of government research efforts continues to anchor the mission-oriented portion of a pluralistic system of funding for US research.

The other dimension of that system is fundamental research. It is anchored in universities and interwoven with graduate education and the preparation of new generations of researchers. This second dimension has its origins in moments of national crisis, when government research efforts were expanded temporarily through the cooptation of university researchers. After such an interlude in World War I, influential US scientists tried to convince the federal government to keep the research relationship with universities going, but to no avail. It was only after the success of the crucial atomic bomb project that the achievements of science carried enough weight to make credible proposals for permanent, across-the-board government support for research (Kevles 1978). The spokesperson for this plan was Vannevar Bush, a university scientist who had been active in government research projects during the war. He called for the formation of a National Research Foundation, to provide basic support for research, without the specific targets that had been imposed during the war.
[Figure 1. US basic research spending, 1953–1998, total and Federal]
He argued that unfettered research, carried out largely in universities, would build a base of human and knowledge resources that would help solve problems across the range of challenges in health, defense, and the economy (Bush 1990 [1945]). But to make these contributions, researchers needed freedom—both freedom in the laboratory to choose and pursue research problems, and organizational freedom at the level of government agencies, to set the larger research agenda in directions that were free from political control. Bush's model of the relationship between science and society has been called the 'autonomy for prosperity' model (Cozzens 2001). Most of the experimentation over the next few decades with ways to maintain autonomy while being useful took the form of add-ons to this model. Bush's model built a protective shell of organizational autonomy around the agencies that provided funds for basic research. The National Science Foundation (NSF) and the National Institutes of Health (NIH), both of which grew very fast in the postwar period, developed strategies that insulated direct decisions around research from the political context. One funding mechanism that embodied the essence of the autonomy-for-prosperity model in its early days was the science development program. Under these
programs in the 1950s and 1960s, federal block grants allowed universities to build their educational and research capacities, with very few strings attached (Drew 1985). The second strategy was the project grant/peer review system of funding. Under this mechanism, the government places the choice of research topics, as well as the judgment of what counts as quality, in the hands of researchers. 'Peer review' for project selection became both the major form of quality control in the federal funding system and an important symbol of scientific freedom (Chubin and Hackett 1990). Ironically, the autonomy-protecting institutional shell that Bush and his successors designed undermined the practical effectiveness that had won credibility for his plan to start with. As government funding for research grew rapidly in the 1950s, beyond the scale of Vannevar Bush's wildest dreams, the institutional shell unintentionally created a protected space where new researchers could do their work without any grounding in the practical problems of the world. As soon as the practical grounding was lost, a gap was created between science and society that needed to be bridged if autonomy was really going to be turned into prosperity.
Science Funding: United States
2. Knowledge Transfer and Knowledge Mandating

Among the first set of funding methods developed to bridge this gap were knowledge transfer mechanisms, which threaten neither laboratory nor organizational autonomy because they leave the protective shell in place while focusing on diffusing or disseminating research-based knowledge. For example, in the 1960s and 1970s, many federal programs addressed the 'information explosion' with 'science information services,' under the rubric of 'information policy.' The government encouraged first journals, then abstracting and indexing services, to provide access to the exploding journal literature, and later added extension services. The emphasis was on providing infrastructure for communication, not shaping its content; science was to speak to the public, not with it. These mechanisms protect research autonomy by segmenting knowledge processes over time.

As the size of the research enterprise continued to grow, however, the pendulum swung back and the accountability issue inevitably arose. Elected representatives of the public asked 'What are we getting for all this spending?' In the absence of specific answers, a backlash eventually appeared. Policymakers began to demand that research solve societal problems directly, rather than through the diffuse claim of 'autonomy for prosperity.' In 1969, for example, Congress passed 'the Mansfield amendment,' limiting support from the Department of Defense to goal-directed research. The Nixon administration phased out science development programs. And in the 1970s, several 'knowledge mandating' programs appeared. At NSF, for example, the Program of Research Applied to National Needs (RANN) grew out of a Presidential concern about 'too much research being done for the sake of research' (Mogee 1973). At NIH, the 'War on Cancer' emerged during the same period (Strickland 1989, Rettig 1977). In their more programmed forms, such programs threatened both individual and organizational autonomy. But even in softer forms, in which money was only redirected from one priority to another, they still threatened organizational autonomy, since external forces were dictating research directions.

There is an important lesson in the history of these programs: over time, organizational autonomy won out over societal knowledge mandating. RANN was abolished, and over the years, both NSF and NIH have found ways to look programmatic without doing central planning.
3. Knowledge Sharing
By the mid-1970s, a third approach began to be developed in the United States to bridge the gap between the science produced inside the protective
institutional shell and the problems of the world surrounding it. This new form, which can be called 'knowledge-sharing,' threatened neither the individual autonomy of researchers nor the organizational autonomy of funding organizations. The approach first took the form of an emphasis on partnerships, and within that emphasis, an early focus on partnerships with industry. New centers involved industry partners in setting strategic plans. The interchange of people and information became the rule. The critical element in these centers was a two-way dialog, which replaced both the one-way science-to-society diffusion of knowledge transfer and the one-way society-to-science direction of knowledge mandating. In the two-way dialog, scientists became strategic thinkers, able to formulate their own problems creatively, but steeped again, as in the 1950s, in the problems articulated by some external set of partners. Another new element was the explicit link to education, at both graduate and undergraduate levels. The Engineering Research Centers of the NSF, for example, were intended to produce a 'new breed' of engineers, better prepared than previous generations to participate in R&D in industry because they understood the needs and culture of industry.
This new partnership model raised questions: Was knowledge mandating being changed from a public function to a private one under this scheme? A change in the law that allowed universities to hold intellectual property rights in results produced under government grants further heightened these concerns. Public knowledge seemed to be on a path to privatization. At the same time, several well-publicized accusations of fraud in science seemed to undermine the trusting relationships among citizens, government, and science (Guston 1999).
In the late 1980s, however, a second crucial step in the development of the partnership model took place. This was a step back toward the public. Centers were urged to form partnerships, not only with industry, but also with State and local governments, citizen groups, and schools. The benefits that industry gained from earlier collaborations were now available to other parts of society, in a new pattern that has been called 'partner pluralism' (Cozzens 2001). Strategic plans began to be shaped by two-way dialog across many sectors, and the education of a new breed of researcher began, one able to bridge the cultures of university, school, and the public service sector.
Another manifestation of knowledge-sharing, and an attempt to restore public trust, arrived with renewed attention among researchers to public awareness of science and to science education. In the 1990s, a new note appeared in science leaders' discussions of the public. Research leaders stressed that scientists needed to learn more about the public, just as the public needed to learn more about science. Even the venerable National Academy of Sciences recommended that the institutions of science open their doors to the public,
urging, for example, that NIH take more advice from the public in priority-setting. These were signs of the beginning of a two-way dialog.
4. The Knowledge Society
As US research entered the twenty-first century, many believed that knowledge was the key resource in the new economy. For a society to be innovative, however, its creative capacity must be widely shared. This goal will not be achieved unless its scientists become strategic thinkers steeped in society's problems and issues, and government funding agencies remain public through partner pluralism. Accountability for research in the knowledge society is achieved through the engagement of many societal actors in the research enterprise. By placing many actors on an equal footing and encouraging two-way dialog, research policy in the twenty-first century will help stimulate shared creativity, and open a new path to prosperity.
See also: Academy and Society in the United States: Cultural Concerns; Research Funding: Ethical Aspects; Science Funding: Asia; Science Funding: Europe; Science, Social Organization of
Bibliography
Bush V 1990 [1945] Science—the Endless Frontier. National Science Foundation, Washington, DC, reprinted from the original
Chubin D E, Hackett E J 1990 Peerless Science: Peer Review and US Science Policy. State University of New York Press, Albany, NY
Cozzens S E 2001 Autonomy and accountability for 21st century science. In: de la Mothe J (ed.) Science, Technology, and Governance. Pinter, London
Drew D E 1985 Strengthening Academic Science. Praeger, New York
Dupree A H 1964 Science in the Federal Government: A History of Policies and Activities to 1940. Harper & Row, New York
England J M 1983 A Patron for Pure Science: The National Science Foundation's Formative Years. National Science Foundation, Washington, DC
Guston D H 2000 Between Politics and Science: Assuring the Productivity and Integrity of Research. Cambridge University Press, New York
Kevles D J 1978 The Physicists: The History of a Scientific Community in Modern America. Knopf, New York
Logsdon J M 1970 The Decision to Go to the Moon: Project Apollo and the National Interest. MIT Press, Cambridge, MA
Mogee M E 1973 Public Policy and Organizational Change: The Creation of the RANN Program in the National Science Foundation. MSc thesis, George Washington University, Washington, DC
Morin A J 1993 Science Policy and Politics. Prentice-Hall, Englewood Cliffs, NJ
Mukerji C 1989 A Fragile Power: Scientists and the State. Princeton University Press, Princeton, NJ
Rettig R A 1977 Cancer Crusade: The Story of the National Cancer Act of 1971. Princeton University Press, Princeton, NJ
Smith B L R 1989 American Science Policy since World War II. Brookings Institution, Washington, DC
Stokes D E 1997 Pasteur's Quadrant: Basic Science and Technological Innovation. Brookings Institution, Washington, DC
Strickland S P 1988 The Story of the NIH Grants Program. University Press of America, Lanham, MD
S. E. Cozzens
Science, New Forms of
In recent times, there has been a growing sense that 'science' is not merely expanding in its grasp of the natural world, but that the activity itself is changing significantly. For this we use the term 'new forms of science,' covering both the practice of science and reflection upon it. Since science is now a central cultural symbol, changes in the image of science rank in importance with those in the practice itself. Our situation now is one of the differentiation of a practice that hitherto seemed unified, and of conflict replacing consensus on many issues, some of which had been settled centuries ago. The prospect for the immediate future is for a social activity of science that no longer has a unified core for its self-consciousness and its public image.
The main baseline for this 'novelty' is roughly the century preceding World War II, when the 'academic' mode of scientific activity became dominant. This was characterized by the displacement of invention and diffusion by disciplinary research, and of amateurs (with private means or patronage) by research professionals (perhaps with a partial sinecure in advanced teaching). In the latter part of this period, the mathematical–experimental sciences took the highest prestige away from the descriptive field sciences. The social sciences were increasingly pressured to conform to the dominant image. And throughout, the consciousness of the activity, reflected in scholarly history and philosophy as well as in popularization, was triumphalist and complacent. There could be no doubt that the idealistic scientists, exploring nature for its own sake, inevitably make discoveries that eventually enrich all humanity both culturally and materially.
Of course, any 'period' in a social activity is an intellectual construct. Closer analyses will always reveal change and diversity within it; and it contains both relics of the deeper past and germs of the future. But occasionally there occurs a transforming event, so that with all the continuities in practice there is a discontinuity in consciousness. In the case of recent science, this was the Manhattan Project, in which the first atomic bombs were designed and constructed in a
gigantic scientific engineering enterprise. It was of particular symbolic significance that the project was conceived and managed, not by humble 'applied scientists' or engineers, but by the elite theoretical physicists. It was science itself that was implicated in the moral ambiguities and political machinations that ensued in the new age of nuclear terror. After the Bomb, it was no longer possible to build scholarly analyses on assumptions of the essential innocence of science and the beneficence of scientists. Prewar studies in the sociology of science, such as those of Merton (with his 'four norms'), which assumed a purity in the endeavor and its adherents, could later find echoes only in those of the amateur philosopher Polanyi (1951), himself embattled against J. D. Bernal and the Marxist 'planners' of science (1939).
But it took about a decade for a critical consciousness to start to take shape. Those scientists who shared the guilt and fear about atomic weapons still saw the cure in educating society, as through the Bulletin of the Atomic Scientists, rather than wondering whether science itself had taken a wrong turning somewhere. Leo Szilard (1961) imagined an institute run by wise scientists that would solve the world's problems; and Jacob Bronowski (1961) faced the problem of the responsibility of science for the Bomb, only to deny it absolutely. The vision of Vannevar Bush (1945), of science as a new and endless frontier for America, continued the prewar perspective into postwar funding.
The first significant novelty in the study of science was accomplished in the 1950s by Derek J. de Solla Price (1963). Although his quantitative method seemed at times to be a caricature of science itself, he was submitting scientific production to an objective scrutiny that had hitherto seemed irrelevant or verging on irreverent. And he produced the first 'limits to growth' result; with a doubling time (steady over three centuries!) of 15 years, compared to some 70 in the economy, sooner or later society would refuse to foot the bill. Since the prevailing attitude among scientists was that society's sole contribution to science should be a blank check, Price's bleak forecast, however vague in its timing, was not well received. He went on to give the first characterization of the new age of science; but he only called it 'big,' as opposed to the 'little' of bygone ages.
A more serious disenchantment motivated Thomas S. Kuhn. He had imbibed the naive progressive image of science, whereby truths were piled on truths, and scientists erred only through bad methods or bad attitudes. Then it all cracked, and he wrote his seminal work on Scientific Revolutions (1962), all the more powerful because of its confusions and unconscious ironies. Whether he actually approved of routinized puzzle-solving 'normal science' was not clear for many years; to idealists like Popper, Kuhn's flat vision of science represented a threat to science and to civilization (1970).
Through the turbulent 1960s, more systematically radical social critiques began to emerge. Of these, the most cautious was John Ziman's Public Knowledge (1968), and the most ambiguous that of Jerome Ravetz (combining Gemeinschaft craftsmen's science, Nuclear Disarmament, and the Counterculture, but ending his book with a prayer by Francis Bacon) (1996 [1971]). An attempt at a coherent Marxist perspective was mounted by Steven and Hilary Rose (1976). This was enlivened by their internal tension between Old and New Left, but with the complete discrediting of Soviet socialism, it was soon relegated to purely historical significance. A serious attempt at an objective social study of science was made by N. D. Ellis (The Scientific Worker, 1969), but there was then no audience for such an approach. Another premature social analysis of science came to light around then. This was the work of Ludwik Fleck (1979 [1935]) on the 'genesis and development of a scientific fact'; at the time of its publication and for long after, the very title smacked of heresy.
At the same time, the new dominant social practice of science, either 'industrialized' (Ravetz) or 'incorporated' (the Roses), took shape. The prophetic words of President Eisenhower on the corrupting influence of state sponsorship were little heeded; and an American science that was both 'pure' and 'big' produced its first muckraker in Dan Greenberg (1967), a Washington correspondent for Science magazine. There was a short but sharp protest at the militarization of science in the Vietnam War (Allen 1970), out of which came various attempts at organizing for 'social responsibility in science.'
On the ideological front, there was a collapse of attempts to preserve science as the embodiment of the Good and the True. Somewhat too late for the 1960s but revolutionary nonetheless, Paul Feyerabend published Against Method (1975), in which even the rearguard defenses of Popper (1963) and Lakatos (1970) were demolished. Then came the tongue-in-cheek naive anthropology of Laboratory Life (Latour and Woolgar 1977), considering scientists as a special tribe producing 'inscriptions' which were intended to gain respect and to rebut hostile criticism. A radically skeptical theory of scientific knowledge was elaborated in sociological studies called 'constructivism.'
As prewar science receded in time and relevance, scholars could settle down to the analysis of this new form of social practice of science in its own terms. Early in the postwar period there had been a conception that scientists in the public arena should be 'on tap but not on top,' a sort of 'drip' theory of their function. But more sophisticated analyses began to emerge. Among these was 'mandated science' (Salter 1988), in which the pitfalls of well-intentioned involvement were chronicled. The special uses and adaptations of science in the regulatory process and in the advisory function were analyzed (Jasanoff 1990). In such studies, it became clear that neither 'science'
nor 'scientist' can be left as elements outside the process; indeed, the old Latin motto 'who guards the guardians?' reminds us of the essentially recursive nature of the regulatory process. For (as Jasanoff showed) those who regulate or advise act by certain criteria and are also selected for their position; and who sets the criteria, and who chooses the agents, can determine the whole process before the scientist-advisors even get to work.
Meanwhile the existing trends in the social practice of science were intensified. Physics lost the preeminence it had gained through winning the war with the Atomic Bomb. Civil nuclear power (which was really engineering anyway) failed to live up to its early promise, producing not only disasters (near and actual) in practice but also intractable problems in waste management. The core of basic physics research, high-energy studies, was caught in a cul-de-sac of ever more expensive machines hunting ever more esoteric particles. It may also have suffered from its overwhelming reliance, not covert but not widely advertised either, on the US military for its funding. Then physics was caught up in the debate over 'Star Wars,' which was controversial not merely on its possible political consequences, but even on the question of whether it could ever work as claimed. In this case, it appeared that the technical criteria of quality were totally outweighed by the political and fiscal.
The focus of leading-edge research shifted to biology, initiated by the epochal discovery of the structure of the genetic material DNA, the 'double helix.' There was a steady growth in the depth of knowledge and in the power of tools in molecular biology. This was punctuated in the 1970s by an episode of discovery of, and reaction to, possible hazards of research. In an unprecedented move, the 'recombinant DNA' research community declared a moratorium on certain experiments, until the hazards had been identified and regulated. Critics of the research interpreted this as an admission that the hazards were real rather than merely hypothetical. There was a brief period (1976–7) of confrontation in the USA, complicated by some traditional town–gown antagonisms and by relics of Vietnam activism. But it soon died down, having served partly to stimulate interest in the research but also planting the seeds of future disputes.
Another novelty in this period was the emergence of ethics as a systematic concern of science. Qualms about the uses of science in warfare go back a long way (Descartes 1638 has a significant statement), as do disputes over priority and honesty. But with the Bomb, the possibility of science being turned to evil ends became very real; and with the loss of innocence in this respect, other sorts of complaints could get a hearing. Medical research is the most vulnerable to criticism, since the subjects are humans (or other sentient beings) and there must be a balance between their personal interests and those of the research (and hence of the
broader community). But some cases went too far, such as that of the African-American men who (in the 1930s) were not given treatment for syphilis, so that their degeneration could be recorded for science. Also, there were revelations of some of the more bizarre episodes of military science at the height of the Cold War, such as subjecting unwitting human subjects either to nuclear radiation or to psychotropic drugs. Since some universities shared responsibility with the military for such outrages, the whole of science could not but be tarnished. Out of all this came a problem of the sort that could not have been imagined in the days of little science: activism, to the point of terror tactics, against scientists and labs accused of cruelty to animals.
A further loss of innocence was incurred in the realization of the Marxist vision of science as 'the second derivative of production,' albeit under conditions that Marxists did not anticipate. Now 'curiosity-driven,' or even 'investigator-initiated,' research forms a dwindling minority within the whole enterprise. The days when the (British) Medical Research Council could simply 'back promising chaps' are long since gone. Increasingly, research is mission oriented, with the missions being defined by the large institutions, public and private, possessing the means and legitimacy to do so.
Again, it is in biology, or rather bioengineering, where the change is most marked. With genetic engineering, science is increasingly caught up in the globalization, or rather commodification, of everything that can be manipulated, be it life-forms or even human personality. When firms are accused of bio-piracy (appropriating plants or genetic materials from local people, and then patenting them), this is not exactly new, since it was also practiced by the illustrious Kew Gardens in Victorian times. But such practices are no longer acceptable in the world community (a useful survey of these issues is to be found in Proctor 1991, Chap. 17).
The social organization of science has changed in order to accommodate these new tasks and circumstances. The rather vague term 'mode 2' has been coined to describe this novel situation (Gibbons et al. 1994). In the terms of this analysis of a condition which is still developing, scientists are reduced to proletarians, neither controlling the intellectual property in the products of their work (since it is done largely on contracts with various restrictions on publication), nor even possessing the definable skills of disciplinary science (since most projects fall between traditional boundaries). The discussion in the book provides a useful term for this sort of science: 'fungible.' For the research workers become a convertible resource, shipped from project to project, to be reassigned or discarded as convenient. There are all sorts of tensions within such a new dispensation, perhaps most notably the contradictory position of the educational institutions, whose particular sort of excellence may not be compatible with
the demands of the new labor market and research regime. With the fragmentation and globalization of politics, other new sorts of scientific practice are emerging, not continuous with the evolving mainstream institutions but either independent of, or in opposition to, them. The leading environmental organizations have their own scientific staff who debate regularly with those promoting controversial projects. Increasingly, 'the public' (or representatives of the more aware and critical sections of it) is drawn into deliberative processes for the evaluation and planning of developments in technology and medicine. In some respects a 'stakeholder society' in science is emerging, engaged in debate on issues that had hitherto been left for the experts and politicians to decide. Movements for 'community research' are emerging in the USA (with a focus in the Loka Institute) and elsewhere.
This development can be seen as an appropriate response to an emerging new major task for science. In the process of succeeding so brilliantly in both discovery and invention, science has thrown up problems that can be loosely defined as risk. The tasks here have significant differences from those of the traditional scientific or technical problems. For one, problems of risk and safety are inherently complex, involving the natural world, technical systems, people, institutions, values, uncertainties, and ignorance. Isolated, reductionist studies of the 'normal science' sort may be an essential component, but cannot determine a policy. Further, 'safety' can never be conclusively established, not merely because it is impossible to prove impossibility, but more because the evidence of causation is either only suggestive (as from toxicological experiments with acute doses on non-human species) or indirect (as from epidemiological studies). In the debates on managing risk, methodology becomes politicized, as it is realized that the assignment of the burden of proof (reflected in the design of statistical studies and of experiments) can be critical in determining the result that goes forward into the policy process. Conflict of interest becomes difficult to avoid, since most opportunities for achieving expertise are gained through work with the interests promoting, rather than with those criticizing, new developments. Also, the debates take place in public forums, such as the media, activist campaigning, or the courts, each of which produces its characteristic distortions of the process.
One response to this increased public involvement has been a variety of recommendations, from respected and even official sources, for greater openness and transparency in the science policy process. In some respects, this is only prudence. In the UK, the controversy over 'mad cow disease' (or the BSE catastrophe) could fester for a decade only because of official control of knowledge and ignorance. In its aftermath, the development of genetically engineered food crops was threatened by 'consumer power' in
Europe and beyond. Expressed negatively, we can be said to be living in a 'risk society' (Beck 1992), where new forms of political activity are fostered by the consequences of an inadequately controlled science–technology system (Sclove 1995, Raffensperger 1999). The assumption of beneficence of that system is now wearing thin; thus the mainstream British journal New Scientist made this comment on the mapping of the human genome: 'For all our nascent genetic knowledge, if we end up with our genes owned by rich corporations and [with] a genetic underclass, will we really have advanced that much?' (2000)
There is no doubt that a significantly modified practice and image of science will be necessary for these new tasks of managing risks in the current social and political context. If nothing else, traditional science assumed that values were irrelevant to its work (this was, paradoxically, one of its great claims to value), and that uncertainties could be managed by technical routines (such as statistics). This was the methodological basis for the restriction of legitimacy to experts and researchers whose 'normal science' training had rigorously excluded such considerations. With uncertainty (extending even to salient ignorance) and value loading involved in any study of risk or safety, we are firmly in a period where a 'post-normal' conception of science is appropriate (Funtowicz and Ravetz 1993). The core of this new conception is 'quality,' since 'truth' is a luxury in contexts where, typically, facts are uncertain, values in dispute, stakes high, and decisions urgent. Under these circumstances, there is a need for an 'extended peer community' for the assurance of the quality of the scientific inputs, supplementing the existing communities of subject specialists and clients. And this new body of peers will have its own 'extended facts.' The materials for this new sort of science involve not only the operations of the natural world, but also the behavior of technical systems and the social systems of control. Therefore, information of new sorts becomes relevant, including local knowledge and commitments, investigative journalism, and published confidential reports.
Although 'post-normal science' has its political aspects, this proposed extension of the legitimacy of participation in a scientific process is not based on political objectives. It is a general conception of the appropriate methodological response to this new predicament of science, where the tasks of managing risks and ensuring safety cannot be left to the community of accredited scientific experts alone.
This account of new forms of science would be incomplete without mention of a tendency which was thought to have been laid to rest long ago, as superstition unworthy of a civilized society. Coming in through East Asian practices of medicine and South Asian practices of consciousness, enriched cosmologies involving 'vibrations' and 'energies' have become popular, even chic. The reduction of reality to its
mathematical dimensions, which was the metaphysical core of the Scientific Revolution, is now being eroded in the practice of unconventional medicine. Although this has not yet touched the core of either research or the policy sciences, it is a presence, alien to our inherited scientific culture, which cannot be ignored.
We are coming to the end of a period lasting several centuries, when in spite of all the major developments and internal divisions and conflicts, it made sense to speak of 'science' as an activity on which there was little disagreement on fundamentals. The very successes of that science have led to new challenges, from which significantly novel forms of science are emerging, characterized by equally novel forms of engagement with society.
See also: Cultural Studies of Science; Foucault, Michel (1926–84); History of Science; History of Science: Constructivist Perspectives; Innovation, Theory of; Kuhn, Thomas S (1922–96); Polanyi, Karl (1886–1964); Popper, Karl Raimund (1902–94); Reflexivity, in Science and Technology Studies; Science, Sociology of; Scientific Knowledge, Sociology of; Social Science, the Idea of
Bibliography
Allen J (ed.) 1970 March 4: Scientists, Students, and Society. MIT Press, Cambridge, MA
Beck U 1992 Risk Society: Towards a New Modernity. Sage, London
Bernal J D 1939 The Social Function of Science. Routledge, London
Bronowski J 1961 Science and Human Values. Hutchinson, London
Bush V 1945 Science: The Endless Frontier. Government Printing Office, Washington, DC
Descartes R 1638 Discours de la Méthode (6th part)
Editorial 2000 New Scientist, July 1
Ellis N D 1969 The scientific worker. PhD thesis, University of Leeds
Feyerabend P 1975 Against Method. New Left Books, London
Fleck L 1979 Genesis and Development of a Scientific Fact. Translation from German edn., 1935. University of Chicago Press, Chicago
Funtowicz S, Ravetz J R 1993 Science for the post-normal age. Futures 25: 739–55
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge. Sage, London
Greenberg D 1967 The Politics of Pure Science. New American Library, New York
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lakatos I 1970 History of science and its rational reconstructions. In: Lakatos I, Musgrave A (eds.) Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Latour B, Woolgar S 1977 Laboratory Life: The Social Construction of Scientific Facts. Sage, Beverly Hills, CA
Merton R 1957 Social Theory and Social Structure. The Free Press, Glencoe, IL
Polanyi M 1951 The Logic of Liberty. Routledge, London
Popper K R 1963 Conjectures and Refutations. Routledge, London
Popper K R 1970 Normal science and its dangers. In: Lakatos I, Musgrave A (eds.) Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Price D J 1963 Little Science, Big Science. Columbia University Press, New York
Proctor R N 1991 Value-free Science? Harvard University Press, Cambridge, MA
Raffensperger C 1999 Editorial: Scientists making a difference. The Networker: The Newsletter of the Science and Environmental Health Network 4
Ravetz J R 1996 [1971] Scientific Knowledge and its Social Problems, new edn. Transaction Publishers, New Brunswick, NJ
Rose S, Rose H 1976 The Political Economy of Science. Macmillan, London
Salter L 1988 Mandated Science: Science and Scientists in the Making of Standards. Kluwer Academic Publishers, Dordrecht, The Netherlands
Sclove R E 1995 Democracy and Technology. The Guilford Press, New York
Szilard L 1961 The Voice of the Dolphins and Other Stories. Simon & Schuster, New York
Ziman J 1968 Public Knowledge. Cambridge University Press, Cambridge, UK
J. R. Ravetz
Science, Social Organization of
The sociology of science has been divided since about 1980 between those contending that science gains sociological significance because of its organizational location and forms, and those arguing that it should be understood for its knowledge-building practices. The two groups have tended to treat social organization in completely different ways, and have consciously developed their ideas in opposition to each other. However, both have used some notion of the social organization of science to explain the constitution of facts about nature and the development of ways of reworking nature for strategic advantage. The tensions between the schools have been productive of new approaches to organizations and the operation of power. A scholar like Jasanoff, interested in political process and expertise, has used constructivist theories of scientific knowledge to show how central science has become to the legal system and regulatory structures (Jasanoff 1990). In contrast, a researcher like Bruno Latour, interested in the social struggles involved in making science 'real,' has demonstrated how the laboratory has become productive of powers that shape contemporary life (Latour 1993). This work, and much more like it, has begun to suggest
the fundamental ways that contemporary political systems and other major institutions depend on science and technology for their forms, operations, and legitimacy.
It would have been hard for sociologists to entirely avoid the correlation between the growth of modern States and the so-called scientific revolution, and the questions it raises about power and control of the natural world. Courts in the fifteenth and sixteenth centuries used scientists to help design military technology, do political astrology, and make the earth into a showplace of power (Grafton 1999, Masters 1998). Even members of the Royal Society in the seventeenth century (often depicted as a politically independent organization for science) dedicated much effort to addressing public problems with their research (Webster 1976). Practical as well as conceptual arts like cartography gained importance in Italy, France, and England as a tool for trade and territorial control. Double-entry bookkeeping provided a way of legitimating both commerce and governmental actions through the management of 'facts' (Poovey 1998). Botany and the plant trade, particularly in Spain, The Netherlands, and France, enriched the food supply and increased the stock of medicinal herbs. Italian engineers tried to tame rivers, the Dutch and French built canals (Masters 1998), the English deployed medical police to ensure public health (Carroll 1998), and the French, English, and Germans worked on forestry (Scott 1998). Military engineering throughout Europe was revolutionized with a combination of classical architectural principles and new uses of cannon fire and other weapons (Mukerji 1997). As Patrick Carroll argues, technoscience was not a product of the twentieth century, but already part of the culture of science (the 'engine science') of the seventeenth century (Carroll 1998).
The point of science and engineering in this period was human efficacy, the demonstration of human capacities to know the world and transform it for effect. One manifestation of this was the cultivation of personal genius and individual curiosity, yielding a dispassionate science, but another was the constitution of political territories that were engineered for economic development and political legibility, and managed (in part) to maintain the strength and health of the population (Mukerji 1997, Scott 1998).
1. Functionalist Foundations of the Sociology of Science
The early sociologists of science, such as Merton and Ben-David, took for granted both historical and contemporary links between science and state power. They were steeped in the sociological literature on organizations that defined technology, at least, as central to the organization of major institutions—
from the military to industry. Merton in his dissertation recognized the historical interest of courts and States in scientists and engineers, but traced the development of a disinterested culture of science, emanating from the places where thought was at least nominally insulated from political pollution: universities, scientific societies, and professional journals (Merton 1973, Crane 1972). Looking more directly at politics and science, Ben-David (1971), still interested in how social organization could promote excellence in science, considered differences in the organization of national science systems and their effects on thought. He assumed a kind of Mannheimian sociology of knowledge to argue that systems of research impact the progress of science. If location in the social system shaped what people could know, then the social organization of science was necessarily consequential for progress in science and engineering (Ben-David 1971).
These approaches to the organization of science were grounded in their historical moment—the Cold War period. Science and engineering were essential to the power struggle of East and West. It was commonly held that WWII had been won through the successful effort to dominate science and technology. Policymakers in the US and Europe wanted to gain permanent political advantages for their countries by constructing a system of research in science and engineering that would continue to dominate both thought and uses of natural resources. Western ideology touted the freedom of thought allowed in the non-Communist world as the route to the future. According to the Mertonians, history seemed to support this posture. England with its Royal Society independent of the government was the one which produced Newton—not France and Italy with their systems of direct state patronage (Crane 1972, Merton 1973). The result was a clear victory for the open society, and reason to be confident about American science, which was being institutionalized inside universities rather than (in most cases) in national laboratories.
2. The Sociology of Scientific Knowledge
For all that the sociology of scientific knowledge (SSK) first presented itself as a radical subfield at odds with the functionalist tradition, it continued Merton's impulse to associate science less with politics than with philosophy (Barnes et al. 1996). Taking the laboratory (rather than the Department of Defense) as the center of calculation for science (and science studies) was an effective way to imagine that power was not at stake in science—even in the US. Moreover, SSK presented good reasons to look more closely at scientific knowledge. In the Mertonian tradition, sociologists had discussed the relationship of organizational structures to scientific progress—as though sociologists could
and would know when science was flourishing. Sociologists of scientific knowledge were not prepared to make such judgments about scientific efficacy, nor passively willing to accept the word of scientists as informants on this. For SSK researchers, what was at stake was the philosophical problem of knowledge—how you know when an assertion about nature is a fact (Barnes et al. 1996, Collins 1985). This was precisely what Merton (1973) had declared outside the purview of sociology, but SSK proponents now transformed it into a research question. How did insiders to the world of science make these kinds of determinations for themselves?
The implications for studying the organization of science were profound. Laboratories were the sites where facts were made, so they were the organizational focus of inquiry (Knorr-Cetina 1981, Latour and Woolgar 1979, Lynch 1985). Making a scientific fact was, by these accounts, a collective accomplishment, requiring the coordination of cognitive practices both within and across laboratories. Laboratories had their own social structures—some temporary and some more stable. In most cases laboratory members distinguished between principal investigators and laboratory technicians, who had different roles in the constitution of knowledge. The significance of research results depended on the social credibility of the researchers. Science was work for gentlemen whose word could be trusted—not tradesmen or women, even if they did the work (Shapin and Schaffer 1985). There were alliances across laboratories, forged with ideas and techniques that were shared by researchers and were dedicated to common problems or ways of solving them (Pickering 1984). To make scientific truths required more than just an experiment that would confirm or disconfirm a hypothesis. The results had to be witnessed and circulated within scientific communities to make the 'facts' known (Shapin and Schaffer 1985). Scientific paradigms needed proponents to defend and promote them. Creating a scientific fact was much like a military campaign; it required a high degree of coordination of both people and things. It was a matter of gaining the moral stature in the scientific community to have a scientist's ideas taken as truthful. The tools accomplishing these ends were both social and cognitive (Shapin and Schaffer 1985).
Members of these two schools (Mertonian and SSK) may have envisioned the task of articulating a sociology of science differently, but they shared a basic interest in socially patterned ways of thinking about, mobilizing, and describing nature (or the nature of things). Now that the dust has settled on their struggle for dominance in sociology, the epistemological break assumed to exist between them has come to seem less profound. Power circulates through laboratories, and scientific experts circulate through the halls of power (Haraway 1989, Jasanoff 1990, 1994, Mukerji 1989). Ways of organizing research affect both the
constitution of knowledge and ways of life (Rabinow 1996). The world we know is defined and engineered through patterns of cognition that include the manipulation of nature—in the laboratory and beyond.
3. The Politics of Knowledge-making and Knowledge Claims
Contemporary research in the sociology of science has shed new light on organizations that sociologists thought they understood before, but never examined for their cognitive processes and relations to nature. Jasanoff's work on the regulatory system in the US, for example, does not simply argue that concern about pollution and scientific regulation of other aspects of life has stimulated new research and yielded new information for policymakers—although that is true. She has shown how the legitimacy of the State has come to depend (to a surprising extent) on its claims to provide at least a minimal level of well-being for the population. Safe air, a healthy food supply, and good drinking water have become taken-for-granted aspects of political legitimacy that depend not only on manipulations of nature but also on the development of new strategies of reassurance. Regulators have not been able simply to ask scientists to use existing expertise to assess and ameliorate problems. They have had to cultivate sciences pertinent to the problems, and face the controversies about the results of new lines of research. The result is a new set of cognitive tools for science honed for policy purposes, and new pressures on political actors to understand and work with at least some of these measurements (Jasanoff 1990, 1994).
Similarly, political legitimacy in rich countries also rests on the government's ability to confront and address medical problems. As Steven Epstein pointed out in his study of AIDS research, the population now generally expects doctors, scientists, and policymakers to solve problems and keep the population healthy. Any fundamental disruption of this faith in expertise leads to anger and (in the US case he studied) political activism. Government officials (like public health workers) in these instances need not only to advocate and support research but also to make the government's health system seem responsive to public needs. This means that the dispassionate pursuit of scientific truths cannot dictate the practice of research, or the dispensing of drugs. Research protocols cannot be entirely determined by experts, but require debate with the activists as well as the professional politicians for whom the problem is an issue (Epstein 1996).
Science is therefore used to manage public health (the body politic) and to create a healthful environment for the citizenry. It is also used to design and manage infrastructures for the society as a whole. Computers are used to design roads, and manage toll systems.
They are employed by hospitals to define illnesses, codify medical practices, and determine courses of treatment for individuals. The military has not only used scientists and engineers (in Cold War style) to develop new weapons and create new modes for their delivery, but has mobilized these groups to develop national communication infrastructures—from road systems to airports to the Internet (Edwards 1996). This engineering is oddly an outgrowth of territorial politics—in this period when States are supposed (by theories from post-modernist to Marxist) to be dying due to globalization (Castells 1996). However, responsibility for health and well-being is circumscribed by national boundaries, and legal responsibility for them is kept within territorial boundaries and remains largely a problem for States.
4. Post-colonial Studies of Science and Technology
The territorial dimensions of legitimacy and public health are particularly apparent in the burgeoning post-colonial literature on science, technology, and medicine, showing the export to poor countries of research involving dangerous materials or medical risks (Rafael 1995). The export of risk was perhaps most dramatically illustrated by the US use of an atoll in the Pacific Ocean for bomb tests, but there have been numerous less dramatic instances of this. Manufacturers using dangerous materials have been encouraged to set up factories in Third World countries where economic growth has been considered a higher priority than environmental safety (Jasanoff 1994). Medical researchers in need of subjects have frequently turned to colonial populations for them (Rafael 1995). These practices make it obvious that dangerous research and manufacture are not necessarily abandoned because of the risks. They are simply done in those less powerful places within a state's sphere of influence where legitimacy is not at stake. In these instances, political regulation at home has not necessarily created a moral revolution in science, but a physical dissociation of practices and responsibility. Outside the social world of Western gentlemen, there has often been little concern on the part of researchers about their moral stature. Governments have not systematically worried about the normative consequences of technological development. Instead, poor people have been treated (like prisoners at home) as a disposable population of test subjects—just what AIDS patients refused to be (Epstein 1996, Jasanoff 1994, Rafael 1995).
This pattern is both paralleled by, and connected to, the export of high technology into post-colonial areas of the world. The growth of manufacturing in Third World countries, using computerized production systems, and the growth of computing itself in both
India and Africa for example, testify to another pattern of export of science and technology (Jasanoff 1994, Jules-Rosette 1990). These practices are usually presented as ways that corporations avoid local union rules, and deploy an educated workforce at low cost (Castells 1996). However, they are also ways large corporations have found to limit in-migration of labor from post-colonial regions, and to avoid political responsibility for the health of laboring people from poorer parts of the world. Less a case of exporting risk, it is a way of exporting responsibility for it (Jasanoff 1994).
5. Commercial Stakes in Scientific Thought
It would be easy to think that while politics is driving some areas of research, there remains a core of pure science like the one desired by Merton. However, commercial as well as political forces work against this end (Martin 1991). Even the publishing system that was supposed to buffer science from the workings of power has turned out to be corruptible. It is not simply that scientific texts (the immutable mobiles of Latour and Woolgar 1979) have found political uses; texts have not been stabilized by being put in print. As Adrian Johns (1998) has shown, the fact that printing technology could fix ideas and stabilize authorship did not mean that the publishing industry would use it this way. In seventeenth-century England—the heyday of the scientific revolution—publications were often pirated, changed for commercial purposes, or reattributed. Scientific authorship never did unambiguously extend empiricism and accurate beliefs about nature, or give scientific researchers appropriate recognition for their work. Publishing in science was just another part of the book trade, and was managed for profit. To this day, commercial pressures as well as peer review shape the public record in science. The purpose of the science section in newspapers is to sell copies, not promote scientific truth. However, scientists still frequently use the popular press to promote their research and advance their careers in science (Epstein 1996).
The practices of science and engineering themselves are not so clearly detached either. Commercial interests in biotechnology and computing have powerful effects on the organization of research, and on relations between the university and industry (Rabinow 1996). Even though the Cold War is over, scientists still often cannot publish what they learn from DOD research (which includes work in computing). They rely on external funding, and so must study what is of interest to the government. They are under the careful supervision of administrators when they export engineering practices from the laboratory into the world (Mukerji 1989). Dreams of democracy served by science and technology seem hard to sustain.
6. Beyond Power/Knowledge
The broad range of contemporary studies of the politics of science and engineering does not just manifest a revival of interest in the political dimensions/connections of science and engineering in the post-Foucaultian world of power/knowledge; it also manifests new understandings of how power operates through multiple organizational forms. Scientists have (both historically and in the present) aided in the articulation of a system of power based on the principles familiar to the Frankfurt School—the domination of nature for the domination of people. Researchers in science studies who focus on the political mobilization of research for organizational advantage are now making clearer how strategic scientific management of the natural world works and does not work. It seems that governments and industry historically have not so much attained the ideas about nature they needed or paid for, but that scientists, in pursuing knowledge, have also produced means of dominating nature that have been used (and to some extent contained) by those institutions (Mukerji 1989). Modern states, in funding the cultivation of cognitive systems for learning about and managing natural resources, nuclear power, chemical pollutants, and viruses, have generated new patterns of domination but have also opened themselves up to new questions of legitimacy and, in some cases (Epstein 1996), to a redistribution of expertise.
The system of scientific and engineering research is not just productive of ideas, but also of transportation systems, research animals, laboratories themselves, and new technologies (like the Internet) (Edwards 1996, Kohler 1994, Rabinow 1996). The result is not just a brains trust of scientists, but an entire sociotechnical environment built for strategic effect (Cronon 1991). The cognitive systems of science and engineering are not just ways of coordinating thought through language to reach the truth, but ways of making the world again to reflect and carry human intelligence (and stupidity) about nature.
See also: Academy and Society in the United States: Cultural Concerns; Disciplines, History of, in the Social Sciences; Human Sciences: History and Sociology; Kantian Ethics and Politics; Normative Aspects of Social and Behavioral Science; Paradigms in the Social Sciences; Research and Development in Organizations; Scientific Academies in Asia; Scientific Knowledge, Sociology of; Truth and Credibility: Science and the Social Study of Science; Universities, in the History of the Social Sciences
Bibliography
Barnes B, Bloor D, Henry J 1996 Scientific Knowledge. University of Chicago Press, Chicago
Ben-David J 1971 The Scientist's Role in Society. Prentice-Hall, Englewood Cliffs, NJ
Bowker G, Star S L 1999 Sorting Things Out: Classification and its Consequences. MIT Press, Cambridge, MA
Carroll P 1998 Ireland: Material Construction of the Technoscientific State. Dissertation, University of California, San Diego, CA
Castells M 1996 The Rise of the Network Society. Blackwell, London
Collins H 1985 Changing Order. Sage, Beverly Hills, CA
Crane D 1972 Invisible Colleges. University of Chicago Press, Chicago
Cronon W 1991 Nature's Metropolis. Norton, New York
Edwards P N 1996 The Closed World. MIT Press, Cambridge, MA
Epstein S 1996 Impure Science. University of California Press, Berkeley, CA
Grafton A 1999 Cardano's Cosmos. Harvard University Press, Cambridge, MA
Habermas J 1975 Legitimation Crisis. Beacon Press, Boston
Haraway D 1989 Primate Visions. Routledge, New York
Jasanoff S 1990 The Fifth Branch. Harvard University Press, Cambridge, MA
Jasanoff S 1994 Learning from Disaster. University of Pennsylvania Press, Philadelphia, PA
Johns A 1998 The Nature of the Book. University of Chicago Press, Chicago
Jules-Rosette B 1990 Terminal Signs: Computers and Social Change in Africa. Mouton de Gruyter, New York
Knorr-Cetina K 1981 The Manufacture of Knowledge. Pergamon Press, New York
Kohler R 1994 Lords of the Fly. University of Chicago Press, Chicago
Latour B 1993 We Have Never Been Modern. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1979 Laboratory Life. Sage Publications, Beverly Hills, CA
Lynch M 1985 Art and Artifact in Laboratory Science. Routledge and Kegan Paul, London
Martin B 1991 Scientific Knowledge in Controversy. State University of New York Press, Albany, NY
Masters R 1998 Fortune is a River. Free Press, New York
Merton R K 1973 The Sociology of Science. University of Chicago Press, Chicago
Mukerji C 1989 A Fragile Power. Princeton University Press, Princeton, NJ
Mukerji C 1997 Territorial Ambitions and the Gardens of Versailles. Cambridge University Press, New York
Pickering A 1984 Constructing Quarks. University of Chicago Press, Chicago
Poovey M 1998 A History of the Modern Fact. University of Chicago Press, Chicago
Proctor R 1991 Value-free Science? Harvard University Press, Cambridge, MA
Rabinow P 1996 The Making of PCR. University of Chicago Press, Chicago
Rafael V 1995 Discrepant Histories. Temple University Press, Philadelphia, PA
Scott J C 1998 Seeing Like a State. Yale University Press, New Haven, CT
Shapin S, Schaffer S 1985 Leviathan and the Air-pump. Princeton University Press, Princeton, NJ
Webster C 1976 The Great Instauration. Holmes and Meier, New York
C. Mukerji
Science, Sociology of
To make science the object of sociological analysis directs attention to the production and consumption of scientific knowledge in diverse cultural contexts, institutional structures, local organizations, and immediate settings. The sociology of science divides into three broad lines of inquiry, each distinguished by a particular mix of theories and methods. The earliest systematic studies (mostly from the 1950s to the early 1970s) focus on the structural contexts of scientists' behavior: what rules govern the pursuit of scientific knowledge, how are scientists judged and rewarded, how is scientific research broken up into dense networks of specialists? In the 1980s, sociologists shift their attention to the practices through which scientific knowledge is constructed—at the laboratory bench or in the rhetoric of professional papers. Starting in the 1990s, science is put in more encompassing societal contexts, as sociologists examine scientists as purveyors of cognitive authority, and explore their linkages to power, politics, and the economy.
1. Precursors
It is remarkable how much the literature in sociology of science is bunched into the last third of the twentieth century. Perhaps only after the deployment of nuclear weapons, or only after genetic engineering raised eugenic nightmares, could sociologists begin to think about science as a social problem rather than as a consistent solution; or maybe earlier generations of sociologists were guided by epistemological assumptions that rendered true scientific knowledge immune from social causes—thus putting it outside the orbit of sociological explanation.
1.1 Classical Anticipations ‘Science’ is nowhere indexed in Max Weber’s encyclopedic Economy and Society, a measure of his unwillingness or inability to see it as a consequential factor in human behavior or social change. Weber’s interest in science was largely methodological and political. Could the causal models employed so effectively in the natural sciences be used as well to study social action? Does the objectivity and neutrality of the social scientist preclude involvement in political activity? Emile Durkheim also sought to institutionalize sociology by making its methods appear scientifically precise, but at the same time considered scientific knowledge as an object of sociological study. Durkheim suggested that basic categories of thought and logic (such as time and space) are social in origin, in that they correspond to fundamental social cat13692
1.2 Science in the Sociology of Knowledge

Even more surprising is the failure of systematic sociological studies of science to emerge from a blossoming sociology of knowledge in the 1920s and 1930s. Neither Max Scheler nor Karl Mannheim, authors of foundational treatises on the social determinants of knowledge, inspired sustained inquiry into the social determinants of science—probably because both distinguished scientific knowledge from other kinds in a way that truncated what sociology could say about it. Scheler isolated the content of scientific knowledge—and the criteria for ascertaining validity—by describing these as absolute and timeless essences, not shaped by social interests. The effects of social structure (specifically, the power of ruling elites) are limited to selections of problems and beliefs from that self-contained and essential realm of ideas. Mannheim sustained the neo-Kantian distinction between formal knowledge of the exact sciences and socio-historical knowledge of culture. Phenomena of the natural world are invariant, Mannheim suggests, and so therefore are criteria for deciding truth (i.e., impartial observations based on accurate measurements). In contrast, cultural phenomena become meaningful only as they are constructed through interest-laden judgments of significance, which are neither impartial nor invariant, and thus they are amenable to sociological explanation. Robert K. Merton’s 1938 classic Science, Technology and Society in Seventeenth-century England (see Merton 1973) tackles a fundamental problem:
why did modern science emerge with a flourish in seventeenth-century England? His answer has become known as the ‘Merton Thesis’: an ethos of Puritanism that provided both the motivating force and legitimating authority for the pursuit of scientific inquiry. Certain religious values—e.g., God is glorified by an appreciation of his handiwork in Nature, or Blessed Reason separates human from beast—created a cultural context fertile for the rise of science. Merton also explains shifts in the foci of research attention among the early modern ‘natural philosophers’ by connecting empirical inquiry to the search for technological solutions to practical problems in mining, navigation, and ballistics.
2. Social Organization of the Scientific Community

When concerted sociological studies of science began in the late 1950s and 1960s, research centered on the institutions or social structures of science—with relatively less attention given to the routine practices involved in making knowledge or to the wider settings in which science was conducted. This work was largely inspired by theories of structural-functional analysis, which ask how the community of scientists is organized in order to satisfy modern society’s need for certified, reliable knowledge. One distinctive feature of this first phase is a reliance on quantitative methods of analysis. With statistical data drawn from surveys of scientists and from the Science Citation Index (and other bibliometric sources), sociologists developed causal models to explain individual variations in research productivity and used topographical techniques such as multidimensional scaling to map the dense networks of scientists working at a research front.
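The mapping step can be illustrated with a minimal sketch in Python; the similarity matrix below is invented for illustration, whereas actual studies derived such matrices from bibliometric sources like the Science Citation Index.

import numpy as np
from sklearn.manifold import MDS

# Hypothetical co-citation similarities among five scientists
# (invented values; 1.0 means identical citation profiles).
similarity = np.array([
    [1.0, 0.8, 0.7, 0.1, 0.1],
    [0.8, 1.0, 0.6, 0.2, 0.1],
    [0.7, 0.6, 1.0, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.2, 0.9, 1.0],
])

# Multidimensional scaling works on dissimilarities: it places each
# scientist as a point in the plane so that map distances approximate
# how dissimilar their citation profiles are.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(1.0 - similarity)

# Tight clusters of points on the resulting map correspond to the
# dense specialist networks at a research front.
print(coords)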
2.1 Norms of Science
The shift from analyzing science in society to analyzing its internal social organization was effected in Merton’s 1942 paper on the normative structure of science (in Merton 1973). Writing under the shadow of Nazism, Merton argues that the success of scientists in extending certified knowledge depends, at once, on a salutary political context (namely democracy, which allows science a measure of autonomy from political intrusions and whose values are said to be congruent with those of science—quite unlike fascism) and on an internal institutionalized ethos of values held to be binding upon the behavior of scientists. This ethos comprises the famous norms of science: scientists should evaluate claims impersonally (universalism), share all findings (communism), never sacrifice truth for personal gain (disinterestedness), and always question authority (organized skepticism). Behavior consonant with these moral expectations is functional for the growth of reliable knowledge, and for that reason young scientists learn through precept how they are expected to behave, conformity is rewarded, and transgressions are met with outrage. Subsequent work ignored Merton’s conjectures about science and democracy, as sociologists instead pursued implications of the four social norms. Studies of behavioral departures from these norms—ethnocentrism, secrecy, fraud, plagiarism, dogmatism—precipitated debates over whether such deviance is best explained by the idiosyncratic characteristics of a few bad apples or by changing structural circumstances (such as commercialization of research) that might trigger increases in such behavior. Sociologists continue to debate the possibility that Merton’s norms are better explained as useful ideological justifications of scientists’ autonomy and cognitive authority. Other research suggests that the norms guiding scientific conduct vary historically, vary among disciplines, vary among organizational contexts (university research vs. military or corporate research), and vary even in their situational interpretation, negotiation, and deployment—raising questions about whether the norms identified by Merton are functionally necessary for enlarging scientific knowledge.
2.2 Stratification and Scientific Careers

The norm of universalism in particular has elicited much empirical research, perhaps because it raises questions of generic sociological interest: how is scientific performance judged, and how are inequalities in the allocation of rewards and resources best described and explained? With effective quantitative measures of individual productivity (number of publications or citations to one’s work), resources (grant dollars), and rewards (prizes, like the Nobel), sociologists have examined with considerable precision the determinants of individual career success or failure. Competition among scientists is intense, and the extent of inequality high: the distribution of resources and rewards in science is highly skewed. A small proportion of scientists publish most research papers (and those papers collect most citations), compete successfully for research grants and prestigious teaching posts, achieve international visibility and recognition, and win cherished prizes. Debate centers on whether these observed inequalities in the reward system of science are compatible with the norm of universalism—which demands that contributions to knowledge be judged on their scientific merit, with resources and opportunities meted out in accordance with those judgments. The apparent elitism of science may result from an ‘accumulation of advantage’: work by relatively more eminent or well-positioned scientists is simply noticed more and thus tends to receive disproportional credit—which (over time) enlarges the gap between the few very successful scientists and everybody else. Such a process may still be universalistic because it is functional for the institutional goal of science: giving greater attention to the research of those with accomplished track records may be an efficient triage of the overwhelming number of new candidate theories or findings. Others suggest that particularism contributes to the stratification of scientists—old boy networks that protect turf and career reputations by rewarding sycophants. The underrepresentation of women in the higher echelons of science has called attention to sometimes subtle sexism that occurs early in the scientific career: restricted access to well-connected mentors, essential research equipment, or opportunities to collaborate, and assignment to trivial problems or mind-numbing tasks.
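The logic of ‘accumulation of advantage’ lends itself to a toy simulation (a hypothetical sketch, not a model taken from the studies cited here): if each new citation goes to a paper with probability proportional to the citations that paper already has, initially identical papers end up with precisely the kind of skewed distribution described above.

import random

# Toy cumulative-advantage process (parameters are illustrative): each
# new citation picks a paper with probability proportional to its
# current citation count plus a small baseline, so that uncited papers
# can still be chosen.
def simulate_citations(n_papers=500, n_citations=5000, baseline=1.0, seed=42):
    rng = random.Random(seed)
    counts = [0] * n_papers
    for _ in range(n_citations):
        weights = [baseline + c for c in counts]
        paper = rng.choices(range(n_papers), weights=weights, k=1)[0]
        counts[paper] += 1
    return sorted(counts, reverse=True)

counts = simulate_citations()
top_share = sum(counts[: len(counts) // 10]) / sum(counts)
print(f"Share of citations held by the top 10% of papers: {top_share:.0%}")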
2.3 Institutionalization of the Scientific Role

A separate line of sociological inquiry (exemplified in work by Joseph Ben-David 1991 and Edward Shils) seeks an explanation for how science first became a remunerable occupation—and later, a profession. How did the role of the scientist emerge from historically antecedent patterns of amateurs who explored nature part-time and generally at their own expense? The arrival of the ‘scientist’ as an occupational self-identification with distinctive obligations and prerogatives is inseparable from the institutionalization of the modern university (itself dependent upon government patronage). Universities provided the organizational form in which the practice of science could become a full-time career—by fusing research with teaching, by allowing (ironically) the historic prestige of universities as centers of theology and scholasticism to valorize the new science, and by providing a bureaucratic means of paying wages and advancing careers. The scientific role has also been institutionalized in corporate and government labs. The difficulties of transporting a ‘pure science’ ideal of university-based research into these very different organizational settings have been the object of considerable sociological attention. Scientists in industry or government face a variety of competing demands: their research is often directed to projects linked to potential profits or policy issues rather than steered by the agenda of their discipline or specialty; the need to maintain trade secrets or national security hampers the ability of scientists in these settings to publicize their work and receive recognition for it. And, as Jerome Ravetz (1971) suggests, the intrusion of ‘bureaucratic rationality’ into corporate and state science compromises the craft character of scientific work: largely implicit understandings and skills shared by the community of scientists and vital for the sustained accumulation of scientific wisdom have little place in accountabilities driven by the bottom line or policy relevance.
2.4 Disciplines and Specialties

Sociologists use a variety of empirical indicators to measure the social and cognitive connections among scientists: self-reports of those with whom a scientist exchanges ideas or preprints, subject classifications of publications in topical indexes or abstract journals, lineages of mentor–student relationships or collaborations, patterns of who cites whom or is cited with whom (‘co-citation’). The networks formed by such linkages show occasional dense clusters of small numbers of scientists whose informal communications are frequent, who typically cite each other’s very recent papers, and whose research focusses on some new theory, innovative method, or breakthrough problem. Emergence of these clusters—for example, the birth of radio astronomy in England after WWII, as described by David Edge and Michael Mulkay (1976)—is a signal that science has changed, both cognitively and socially: new beliefs and practices are ensconced in new centers for training or research with different leaders and rafts of graduate students. Over time, these specialties evolve in a patterned way: the number of scientists in the network becomes much larger and the connections among them more diffuse, the field gets institutionalized with the creation of its own journals and professional associations, and shattering innovations become less common as scientists work more on filling in details or adding precision to the now-aging research framework. As one specialty matures, another dense cluster of scientists emerges elsewhere, as the research front or cutting edge moves on.
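How ‘co-citation’ linkages might be counted can be shown in a few lines (a minimal sketch; the paper identifiers are invented for illustration, and real analyses draw reference lists from bibliometric databases): two works are co-cited whenever they appear together in the same paper’s reference list, and frequently co-cited pairs mark the dense clusters just described.

from collections import Counter
from itertools import combinations

# Each entry is one citing paper's reference list (identifiers are
# hypothetical); a pair of works is co-cited when both appear in the
# same list.
reference_lists = [
    {"Merton1942", "Hagstrom1965", "Crane1972"},
    {"Merton1942", "Crane1972", "Mulkay1979"},
    {"Latour1979", "Collins1985", "KnorrCetina1981"},
    {"Latour1979", "Collins1985"},
]

co_citations = Counter()
for refs in reference_lists:
    for pair in combinations(sorted(refs), 2):
        co_citations[pair] += 1

# The resulting counts form the similarity matrix that mapping
# techniques such as multidimensional scaling take as input.
for pair, count in co_citations.most_common(3):
    print(pair, count)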
3. Sociology of Scientific Knowledge

A sea-change in sociological studies of science began in the 1970s with a growing awareness that studies of the institutional and organizational contexts shaping scientists’ behavior could not illuminate sufficiently the processes that make science science: experimental tinkering, sifting of evidence, negotiation of claims, replacement of old beliefs about nature with new ones, achievement of consensus over the truth. All of these processes—observation, getting instruments and research materials (e.g., mice) to work, logic, criteria for justifying a finding as worthy of assent, choices among theories, putting arguments into words or pictures, persuading other scientists that you are correct—are uncompromisingly social, cultural, and historical phenomena, and so sociologists set about to explain and interpret the content of scientific knowledge by studying the routine practices of scientific work.
This research is guided by constructivist theories (and, less often, ethnomethodology), which center attention on the practically accomplished character of social life. Rather than allow a priori nature or given social structures to explain behavior or belief, constructivist sociologists examine how actors incessantly make and remake the structural conditions in which they work. Such research relies methodologically on historical case studies of scientific debate, on up-close ethnographic observations of scientific practices, and on interpretative analysis of scientific texts.
3.1 Sociology of Discovery

Diverse studies of scientific discovery illustrate the range of sociological perspectives brought to bear on these consequential events. An early line of inquiry focusses on the social and cognitive contexts that explain the timing and placing of discoveries: why did these scientists achieve a breakthrough then and there? Historical evidence points to a pattern of simultaneous, multiple, and independent discoveries—that is, it is rare for a discovery to be made by a scientist (or a local team) working alone in the world on that specific question. Because honor and recognition are greatest for solutions to what are perceived as the ‘hottest’ problems in a discipline, the best scientists are encouraged by the reward system of science to tackle similar lines of research. But these same social structures can also forestall discovery or engender resistance to novel claims. Cognitive commitments to a long-established way of seeing the natural world (reinforced by reputations and resources dependent upon those traditional perspectives) can make it difficult for scientists to see the worthiness of a new paradigm. Resistance to new ideas seems to be greatest among older scientists, and in cases where the proposed discovery comes from scientists with little visibility or stature within the specialty or discipline that would be transformed. More recent sociological research considers the very idea of ‘discovery’ as a practical accomplishment of scientists. Studies inspired by ethnomethodology offer detailed descriptions of scientific work ‘first-time-through,’ taking note of how scientists at the lab bench decide whether a particular observation (among the myriad observations) constitutes a discovery. Other sociologists locate the ‘moment’ of discovery in downstream interpretative work, as scientists narrate first-time-through research work with labels such as ‘breakthrough.’ Such discovery accounts are often sites of dissensus, as scientists dispute the timing or implications of an alleged discovery amid ongoing judgments of its significance for subsequent research initiatives or allocations of resources. These themes—interests, changing beliefs, ordinary scientific work, post-hoc accountings, dissent, persuasion—
have become hallmarks of the sociology of scientific knowledge.
3.2 Interests and Knowledge-change

From the mid-1970s through the 1980s, the sociology of scientific knowledge took root at the Science Studies Unit in Edinburgh, as philosopher David Bloor developed the ‘strong programme,’ while Barry Barnes, Steven Shapin, Donald MacKenzie, and Andrew Pickering developed its sociological analog—the ‘interest model.’ The goal is to provide causal explanations for changes in knowledge—say, the shift from one scientific understanding of nature to a different one. Scientists themselves might account for such changes in terms of the greater evidence, coherence, robustness, promise, parsimony, predictive power, or utility of the new framework. Sociologists, in turn, account for those judgments in terms of social interests of scientists that are either extended or compromised by a decision to shift to the new perspective. What becomes knowledge is thus contingent upon the criteria used by a particular community of inquirers to judge competing understandings of nature, and also upon the goals and interests that shape their interpretation and deployment of those criteria. Several caveats are noted: interests are not connected to social positions (class, for example, or nationality, discipline, specialty) in a rigidly deterministic way; social interests may change along with changes in knowledge; and choices among candidate knowledge-claims are not merely strategic—that is, calculations of material or symbolic gains are bounded by considerable uncertainty and by a shared culture of inquiry that provides standards for logical or evidential adequacy and for the proper use of an apparatus or concept. Drawing on historical case studies of theoretical disputes in science—nineteenth-century debates over phrenology and statistical theory, twentieth-century debates among high-energy physicists over quarks—such work connects two different kinds of interests causally to knowledge-change. Political or ideological commitments can shape scientists’ judgments about candidate knowledge claims: the development of statistical theories of correlation and regression by Francis Galton, Karl Pearson, and R. A. Fisher depended vitally on the utility of such measures for eugenic objectives. Different social interests arise from accumulated expertise in working with certain instruments or procedures, which inclines scientists to prefer theories or models that allow them to capitalize on those skills.
3.3 Laboratory Practices and Scientific Discourse

As sociologists moved ever closer to the actual processes of ‘doing science,’ their research divided into two lines of inquiry: some went directly to the laboratory bench seeking ethnographic observations of scientists’ practices in situ; others examined scientists’ discourse in talk and texts—that is, their accounting practices. These studies together point to an inescapable conclusion: there is nothing not-social about science. From the step-by-step procedures of an experiment to writing up discovered facts for journal publication, what scientists do is describable and explicable only as social action—meaningful choices contingent on technical, cognitive, cultural, and material circumstances that are immediate, transient, and largely of the scientists’ own making. Laboratory ethnographies by Karin Knorr-Cetina (1999), Michael Lynch (1993), and Bruno Latour and Steve Woolgar (1986) reveal a science whose order is not to be found in transcendent timeless rules of ‘scientific method’ or ‘good lab procedures,’ but in the circumstantial, pragmatic, revisable, and iterative choices and projects that constitute scientific work. These naturalistic studies emphasize the local character of scientific practice, the idea that knowledge-making is a process situated in particular places with only these pieces of equipment or research materials or colleagues immediately available. Never sure about how things will turn out in the end, scientists incessantly revise the tasks at hand as they try to get machines to perform properly, control wild nature, interpret results, placate doubting collaborators, and rationalize failures. Even methodical procedures widely assumed to be responsible for the objective, definitive, and impersonal character of scientific claims—experimental replication, for instance—are found to be shot through with negotiated, often implicit, and potentially endless judgments about the competence of other experimentalists and the fidelity of their replication attempts to the original (as Harry Collins (1992) has suggested). Ethnographic studies of how scientists construct knowledge in laboratories then compelled sociologists to figure out how the outcomes of those mundane contextual practices (hard facts, established theories) could paradoxically appear so unconstructed—as if they were given in nature all along, and now just found (not made). Attention turned to the succession of ‘inscriptions’ through which observations become knowledge—from machine output to lab notebook to draft manuscript to published report. Scientists’ sequenced accounts of their fact-making rhetorically erase the messy indeterminacy and opportunism that sociologists have observed at the lab bench, and substitute a story of logic, method, and inevitability in which nature is externalized as waiting to be discovered. Such studies of scientific discourse have opened up an enduring debate among constructivist sociologists of science: those seeking causal explanations for scientists’ beliefs treat interests as definitively describable by the analyst, while others (Gilbert and Mulkay 1984, Woolgar 1988) suggest that sociologists must respect the diversity of participants’ discursive accounts of their interests, actions, or beliefs—and thus treat actors’ talk and text not as mediating the phenomena of study but as constituting them.
3.4 Actor-networks and Social Worlds

After sociologists worked to show how science is a thoroughly social thing, Bruno Latour (1988) and Michel Callon (1986) then retrieve and reinsert the material: science is not only about facts, theories, interests, rhetoric, and power but also about nature and machines. Scientists accomplish facts and theories by building ‘heterogeneous networks’ consisting of experimental devices, research materials, images and descriptive statistics, abstract concepts and theories, the findings of other scientists, and persuasive texts—and, importantly, none of these is reducible to any one of them, nor to social interests. Things, machines, humans, and interests are, in the practices of scientists, unendingly interdefined in and through these networks. They take on meanings via their linkages to other ‘actants’ (a semiotic term for anything that has ‘force’ or consequence, regardless of substance or form). In reporting their results, scientists buttress claims by connecting them to as many different actants as they can, in hopes of defending the putative fact or theory against the assault of real or potential dissenters. From this perspective, length makes strength: the more allies enrolled and aligned into a network—especially if that network is then stabilized or ‘black boxed’—the less likely it is that dissenters will succeed in disentangling the actants and thereby weakening or killing the claim. Importantly for this sociology of science, the human and the social are decentered, in an ontology that also ascribes agency to objects of nature or experimental apparatuses. Actor-network theory moved the sociological study of science back outside the laboratory and professional journal—or, rather, reframed the very idea of inside and outside. Scientists and their allies ‘change the world’ in the course of making secure their claims about nature, and in the same manner. In Latourian vernacular, not only are other scientists, bits of nature, or empirical data enlisted and regimented, but also political bodies, protest movements, the media, laws, and hoi polloi. When Louis Pasteur transformed French society by linking together microbes, anthrax, microscopes, laboratories, sick livestock, angry farmers, nervous Parisian milk-drinkers, public health officials, lawmakers, and journalists into what becomes a ‘momentous discovery,’ the boundary between science and the rest of society is impossible to locate. Scientists are able to work autonomously at their benches precisely because so many others outside the lab are also ‘doing science,’ providing the life support (money, epistemic acquiescence) on which science depends. The boundaries of science also emerge as theoretically interesting in related studies derived from the brand of symbolic interactionism developed by Everett Hughes, Herbert Blumer, Anselm Strauss, and Howard Becker (and extended into research on science by Adele Clarke 1990, Joan Fujimura, and Susan Leigh Star). On this score, science is work—and, instructively, not unlike work of any other kind. Scientists (like plumbers) pursue doable problems, where ‘doability’ involves the articulation of tasks across various levels of work organization: the experiment (disciplining research subjects), the laboratory (dividing labor among lab technicians, grad students, and postdocs), and ‘social worlds’ (the wider discipline, funding agencies, or maybe animal-rights activists). Scientific problems become increasingly doable if ‘boundary objects’ allow for cooperative intersections of those working on discrete projects in different social worlds. For example, success in building California’s Museum of Vertebrate Zoology in the early twentieth century depended upon the standardization of collection and preparation practices (here, the specimens themselves become boundary objects) that enabled biologists to align their work with trappers, farmers, and amateur naturalists in different social worlds. As in actor-network theory, sociologists working in the ‘social worlds’ tradition make no assumption about where science leaves off and the rest of society begins—those boundaries get settled only provisionally, and remain open to challenge from those inside and out.
4. Science as Cultural Authority

It is less easy to discern exactly what the sociology of science is just now, and where it is headed. Much research centers on the position of science, scientists, and scientific knowledge in the wider society and culture. Science is often examined as a cognitive or epistemic authority; scientists are said to have the legitimate power to define facts and assess claims to truth. This authority is not treated as an inevitable result of the character or virtue of those who become scientists, the institutional organization of science (norms, for example), or the ‘scientific method.’ It is, rather, an accomplished resource pursued strategically by a profession committed not only to extending knowledge but also to the preservation and expansion of its power, patronage, prestige, and autonomy. No single theoretical orientation or methodological program now prevails. Constructivism remains appealing as a means to render contingent and negotiable (rather than ‘essential’) those features of scientific practice said to justify its epistemic authority. But as the agenda in the sociology of science shifts from
epistemological issues (how is knowledge made?) to political issues (whose knowledge counts, and for what purposes?), constructivism has yielded to a variety of critical theories (Marxism, feminism and postmodernism) that connect science to structures of domination, hierarchy, and hegemony. A popular research site among sociologists of science is the set of occasions where scientists find their authority challenged by those whose claims to knowledge lack institutional legitimacy.
4.1 Credibility and Expertise

Steven Shapin (1994) (among others) has identified credibility as a constitutive problem for the sociology of science. Whose knowledge-claims are accepted as believable, trustworthy, true, or reliably useful—and on what grounds? Plainly, contingent judgments of the validity of claims depend upon judgments of the credibility of the claimants—which has focussed sociological attention on how people use (as they define) qualities such as objectivity, expertise, competence, personal familiarity, propriety, and sincerity to decide which candidate universe becomes provisionally ‘real.’ A long-established line of sociological research examines those public controversies that hinge, in part, on ‘technical’ issues. Case studies of disputes over environmental and health risks find a profound ambivalence: the desire for public policy to be decided by appropriate legislative or judicial bodies in a way that is both understandable and accountable to the populace runs up against the need for expert skills and knowledge monopolized by scientific, medical, or engineering professionals. Especially when interested publics are mobilized, such disputes often become ‘credibility struggles’ (as Steven Epstein (1998) calls them). In his study of AIDS politics, Epstein traces out a shift from activists’ denunciation of scientists doing research on the disease to their gaining a ‘seat at the table’ by learning enough about clinical trials and drug development to participate alongside scientists in policy decisions. In this controversy, as in many others, the cultural boundaries of science are redrawn to assign (or, alternatively, to deny) epistemic authority to scientists, would-be scientists, citizens, legislators, jurists, and journalists.
4.2 Critique of Science

Recent sociological studies have themselves blurred the boundaries between social science and politics by examining the diverse costs and benefits of science. Whose agenda does science serve—its own? global capital? political and military elites? colonialism? patriarchy? the Earth’s? Studies of molecular biology and biotechnology show how the topics chosen for scientific research—and the pace at which they are pursued—are driven by corporate ambitions for patents, profits, and market share. Related studies of the Green Revolution in agricultural research connect science to imperialist efforts to replace indigenous practices in less developed countries with ‘advanced technologies’ more consonant with the demands of global food markets. Feminist researchers are equally interested in the kinds of knowledge that science brings into being—and, even more, the potential knowledges not sought or valorized. In the nineteenth century, when social and natural science offered logic and evidence to legitimate patriarchal structures, other styles of inquiry and learning practiced among women (Parisian salons, home economics, midwifery, and cookery) were denounced as unscientific and, thus, suspect. Other feminists challenge the hegemony of scientific method as a way of knowing incapable of seeing its own inevitable situatedness and partiality; some suggest that women’s position in a gender-stratified society offers distinctive epistemic resources that enable fuller and richer understandings of nature and culture. These critical studies share an interest in exposing another side of science: its historical complicity with projects judged to be inimical to goals of equality, human rights, participatory democracy, community, and sustainable ecologies. They seek to fashion a restructured ‘science’ (or some successor knowledge-maker) that would be more inclusive in its practitioners, more diverse in its methods, and less tightly coupled to power.

See also: Actor Network Theory; Cultural Studies of Science; Laboratory Studies: Historical Perspectives; Norms in Science; Science and Technology Studies: Ethnomethodology; Science, Social Organization of; Scientific Controversies; Scientific Culture; Scientific Knowledge, Sociology of; Strong Program, in Sociology of Scientific Knowledge; Truth and Credibility: Science and the Social Study of Science
Bibliography

Barber B, Hirsch W (eds.) 1962 The Sociology of Science. Free Press, Glencoe, IL
Barnes B, Bloor D, Henry J 1996 Scientific Knowledge: A Sociological Analysis. University of Chicago Press, Chicago
Barnes B, Edge D (eds.) 1982 Science in Context. MIT Press, Cambridge, MA
Ben-David J 1991 Scientific Growth. University of California Press, Berkeley, CA
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief. Routledge & Kegan Paul, London
Clarke A E 1990 A social worlds research adventure. In: Cozzens S E, Gieryn T F (eds.) Theories of Science in Society. Indiana University Press, Bloomington, IN
Clarke A E, Fujimura J H (eds.) 1992 The Right Tools for the Job: At Work in Twentieth-century Life Sciences. Princeton University Press, Princeton, NJ
Collins H M 1992 Changing Order: Replication and Induction in Scientific Practice. University of Chicago Press, Chicago
Edge D O, Mulkay M J 1976 Astronomy Transformed. Wiley, New York
Epstein S 1998 Impure Science: AIDS, Activism, and the Politics of Knowledge. University of California Press, Berkeley, CA
Gieryn T F 1999 Cultural Boundaries of Science: Credibility on the Line. University of Chicago Press, Chicago
Gilbert G N, Mulkay M 1984 Opening Pandora’s Box: A Sociological Analysis of Scientists’ Discourse. Cambridge University Press, Cambridge, UK
Hagstrom W O 1965 The Scientific Community. Basic Books, New York
Harding S 1986 The Science Question in Feminism. Cornell University Press, Ithaca, NY
Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) 1995 Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Kloppenburg J R Jr. 1988 First the Seed: The Political Economy of Plant Biotechnology 1492–2000. Cambridge University Press, Cambridge, UK
Knorr-Cetina K 1999 Epistemic Cultures. Harvard University Press, Cambridge, MA
Knorr-Cetina K, Mulkay M J (eds.) 1983 Science Observed. Sage, Beverly Hills, CA
Latour B 1988 Science in Action. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1986 Laboratory Life: The Construction of Scientific Facts. Princeton University Press, Princeton, NJ
Long J S, Fox M F 1995 Scientific careers: Universalism and particularism. Annual Review of Sociology 21: 45–71
Lynch M 1993 Scientific Practice and Ordinary Action. Cambridge University Press, Cambridge, UK
Merton R K 1973 The Sociology of Science. University of Chicago Press, Chicago
Mulkay M J 1979 Science and the Sociology of Knowledge. George Allen & Unwin, London
Mulkay M J 1980 Sociology of science in the West. Current Sociology 28: 1–184
Nelkin D (ed.) 1984 Controversy: The Politics of Technical Decisions. Sage, Beverly Hills, CA
Ravetz J R 1971 Scientific Knowledge and its Social Problems. Oxford University Press, Oxford, UK
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S 1995 Here and everywhere: Sociology of scientific knowledge. Annual Review of Sociology 21: 289–321
Woolgar S 1988 Science: The Very Idea. Tavistock, London
Zuckerman H 1988 The sociology of science. In: Smelser N (ed.) Handbook of Sociology. Sage, Newbury Park, CA
T. F. Gieryn
Science, Technology, and the Military

Technology has changed warfare since time immemorial. The invention of gunpowder, artillery, and rifles revolutionized warfare. Individual scientists, for centuries, have advised the military on specific problems: Archimedes reportedly helped the tyrant of Syracuse in devising new weaponry against the Romans in 212 BC; Leonardo da Vinci supplied us with a variety of drawings of new armaments; and, since the emergence of ‘modern’ science in the sixteenth and seventeenth centuries, many prominent scientists, including Tartaglia, Galileo, Newton, Descartes, Bernoulli, and Euler, have devoted some of their time and intellect to helping solve military problems. This article first argues that World War II and the subsequent Cold War produced a dramatic change in the way scientists became involved in the weapons innovation process. Next, it shows that concerns about the resulting ‘arms race’ brought about a new type of studies—defense technology assessment studies—that dealt with the impact of new weapons systems on national and international security. Many of the newly developed weapons were perceived to have a negative impact, which raised the question of whether and how the weapons innovation process could be influenced. The article discusses a variety of analytical approaches aimed at understanding the dynamics of the weapons innovation process. It argues that a sociotechnical network approach is the most promising for providing insights useful in influencing this innovation process. This approach also provides a suitable framework for investigating the relationship between civil and military technological innovation, a subject of growing interest that is discussed in the final section.
1. Weapons Innovation Becomes Organized

World War II produced a dramatic change in the way scientists became involved in military matters. Science and scientists in great numbers were mobilized for weapons innovation in a highly organized and concentrated effort. In the USA these scientists contributed, mainly under the auspices of the newly established Office of Scientific Research and Development, to the development of a variety of new technologies, including the atomic bomb, radar, the proximity fuse, and also penicillin. The decisive contribution of scientists to these war efforts implied a fundamental shift in the role of science and technology in future military affairs. Immediately after the war, US science policy pioneer Vannevar Bush (1945), drawing on the war experience, advised the President:

[t]here must be more—and more adequate—military research in peacetime. It is essential that the civilian scientists continue in peacetime some portion of those contributions to national security which they have made so effectively during the war.
His advice stood in sharp contrast to the suggestion Thomas Alva Edison had made to the Navy many years before, during World War I: that it should bring into the war effort at least one physicist, in case it became necessary to ‘calculate something’ (Gilpin 1962, p. 10). For the first time in history, military research and development (R&D) became an institutionalized process even in peacetime, on a scale not seen before; it was legitimized as well as fueled by the climate of the Cold War. In the decades following the war, weapons were replaced in a rapid process of ‘planned obsolescence.’ The R&D was carried out in national laboratories, the defense industry, laboratories of the military services, and at universities, to varying degrees in different countries. A United Nations study of 1981 estimated that annually some $100 billion, that is, some 20–25 percent of all R&D expenditures, were devoted to military R&D. The resulting ‘qualitative’ arms race in nuclear, conventional, and biological and chemical weapons between the NATO and Warsaw Pact countries during the Cold War raised the question of whether national and international security actually decreased, rather than increased, as a result of ‘destabilizing’ weapons innovations. A related question was whether military technological developments could be steered or directed so as not to undermine international arms control agreements. After the end of the Cold War in 1990, observers wondered why, particularly in the Western industrialized countries, military R&D efforts continued nearly unabated while the original threat had disappeared. The systematic and organized involvement of science and technology in developing new armaments raised more questions of both societal and social-scientific interest. For instance, to what extent have military R&D and defense relations influenced academic research (mainly a concern in the USA) and the course—or even the content—of scientific and technological developments more generally? What is the relationship between (developments in) civil and military technology: are these separate developments or do they, on the contrary, profit from each other through processes of diffusion and spin-off? Addressing these questions has become an interdisciplinary task, drawing on and integrating insights from many fields. This challenge has been taken up, though still on a limited scale, within the framework of the science and technology (S&T) studies that have emerged since the 1970s. This article focuses on the origin and nature of ‘defense technology assessment studies’ and the related research on possibilities for influencing the weapons innovation process.
2. Defense Technology Assessment Studies

The declared purpose of weapons innovation is enhancing national security (often broadly interpreted as including intervention and power projection). However, during the Cold War many argued that the ever-continuing weapons innovation process was actually counter-productive. The issue was not only that the huge amounts spent on armament might be a waste of resources, but also that the huge military R&D effort and the resulting new weapons caused a rapid decrease rather than an increase of national security (e.g., York 1971, p. 228). Since the 1960s, many studies, often carried out by scientists who had become concerned about the escalating arms race, have dealt with the impact of new weapons systems and new military technologies on national and international security. These studies pointed out, for instance, that the anti-ballistic missile (ABM) systems, consisting of many land-based antimissile missiles, as proposed by the USA in the 1960s, would actually stimulate the Soviet Union to deploy even more nuclear missiles. Also, a similar ABM system by the Soviet Union would trigger the USA to deploy multi-warhead missiles (MIRVs—Multiple Independently Targetable Re-entry Vehicles). These missiles, carrying up to twelve nuclear warheads, could thus saturate the capabilities of the Soviet ABM interception missiles. Actually, the development and deployment of MIRVed missiles by the USA even preceded a possible Soviet ABM system. As the then US Defense Secretary, Robert McNamara, wrote (quoted in Allison and Morris 1975, p. 118):

Because the Soviet Union might [emphasis in original] deploy extensive ABM defenses, we are making some very important changes in our strategic missile forces. Instead of a single large warhead our missiles are now being designed to carry several small warheads … . Deployment by the Soviets of a ballistic missile defense of their cities will not improve their situation. We have already [emphasis added] taken the necessary steps to guarantee that our strategic offensive forces will be able to overcome such a defense.
This weapons innovation ‘dynamics’ was aptly encapsulated by Jerome Wiesner, former science adviser to President John F. Kennedy, as ‘we are in an arms race with ourselves—and we are winning.’ In the 1990s, when the USA continued its efforts to develop anti-satellite (ASAT) technology capable of destroying an adversary’s satellites, Wiesner’s words might rightly have been paraphrased as ‘we are in an arms race with ourselves—and we are losing.’ For the irony here is that it is the US military system that, more than any other country’s defense system, is dependent on satellites (for communication, reconnaissance, eavesdropping, and so on), which would be highly vulnerable to a hostile ASAT system. The most likely route, however, for hostile countries to obtain advanced ASAT technology would not be through their own R&D, but through the proliferation, that is, diffusion, of US ASAT technology once it had been developed. Again, the USA would very likely decrease rather than increase its own security by developing ASAT technology. Many of the early and later ‘impact assessments’ of weapons innovations assessed their potential for circumventing and undermining existing international agreements that aimed to halt the arms race, like the Anti-Ballistic Missile (ABM) Treaty (1972) and the accompanying Strategic Arms Limitation Agreements (SALT, 1972), the SALT II Treaty (1979), and a Comprehensive Test Ban Treaty (CTBT, 1996). In addition, assessments were made, both by independent scientists and governmental agencies, of the potential of civil technologies to spill over into military applications: for in such cases, countries could, under the guise of developing civil technologies, make all the preparations needed for developing nuclear, chemical, or biological weapons. Through these ‘dual-use’ technologies, a proliferation of weapons could occur, or at least the threshold for obtaining those weapons could be lowered, without formally violating the Non-Proliferation Treaty (1971), the Comprehensive Test Ban Treaty (1996), the Biological Weapons Convention (1972), or the Chemical Weapons Convention (1993). The Arms Control readings (York 1973) from the magazine Scientific American provide an instructive sample of early defense technology assessments. In the 1980s, both the USA and NATO emphasized the importance of a stronger conventional defense, which was then believed to be feasible because of new ‘emerging technologies,’ including sensor and guidance technologies, C³I (Command, Control, Communications and Intelligence) technologies (like real-time data processing), electronic warfare, and a variety of new missiles and munitions. The emphasis on high technology in the conventional weapons area, in its turn, triggered a great variety of studies (pro and con) by academic and other defense analysts, not only in the USA but also in Europe. These included analyses of the technical feasibility of the proposed systems, the affordability of acquiring sufficient numbers of ever more costly weapons, and the associated consequences for national defense and international security. The results of these defense technology assessments were fed into the public discussion, with the aim of containing the arms race and reaching international arms control agreements, or preventing their erosion. The impact of these assessments on the weapons innovation process and its accompanying military R&D was often limited. Many, therefore, considered the weapons innovation process to be ‘out of control,’ providing another topic for S&T studies of military technology.
3. Influencing the Weapons Innovation Process

What does it mean to ‘bring weapons innovation under political control’? The concept seems obvious at one level, but it is actually not trivial and needs elaboration. In national politics there is often no consensus on the kinds of armament that are desirable or necessary. Those who say that politics is not in control may actually mean that developments are not in accordance with their political preferences, whereas those who are quite content with current developments may be inclined to say that politics is in control. Neither position is analytically satisfactory. But neither is it satisfactory to say that politics is in control simply because actual weapons innovations are the outcome of the political process, which includes lobbying by defense contractors, interservice rivalry, bureaucratic politics, arguments over ideology and strategic concepts, and so forth, as Greenwood (1990) has suggested. Rather than speaking of control, one should ask whether it would be possible to influence the innovation process in a systematic way, or to steer it according to some guiding principle (Smit 1989). This implies that the basic issue of ‘control,’ even for those who are content with current developments, concerns whether it would be possible to change their course if this were desired. A plethora of studies have appeared on what has been called the technological arms race (see e.g. Gleditsch and Njolstad 1990). Many of them deal with what President Eisenhower in his much-cited farewell address called the military-industrial complex, later extended to include the bureaucracy as well. These studies belong to what has been called the bureaucratic-politics school or domestic structure model (Buzan 1987, Chap. 7), in contrast to the action-reaction models (Buzan 1987, Chap. 6), which focus on interstate interactions as an explanation of the dynamics of the arms race. A third approach, the technological imperative model (Buzan 1987, Chap. 8), sees technological change as an independent factor in the arms race, causing an unavoidable advance in military technology, if only for its links with civil technological progress—though a link whose importance is under debate (see also the last section of this article). By contrast, Ellis (1987), in his social history of the machine gun, has shown the intricate interweaving of weapons innovation with social, military, cultural, and political factors. To some extent, these studies might be considered complementary, focusing on different elements in a complex pattern of weapons development and procurement. For instance, the ‘reaction’ behavior in the interstate model might be translated into the ‘legitimation’ process of domestically driven weapons developments. Many of these studies are of a descriptive nature. Some (Allison and Morris 1975, Brooks 1975, Kaldor 1983) are more analytical and try to identify important determinants in the arms race. These factors are predominantly internal to each nation and partly linked up with the lengthy process—10 to 15 years—of the development of new weapons systems. Other
studies (Rosen 1991, Demchak 1991) relate military innovation, including military technology, to institutional and organizational military factors and to the role of civilians. There are quite a number of case studies on the development of specific weapons systems, and empirical studies on the structure of defense industries (Ball and Leitenberg 1983, Kolodziej 1987, Gansler 1980, 1989) or arms procurement processes (Cowen 1986, Long and Reppy 1980). However, hardly any of these studies focus on the question of how the direction of the weapons innovation process and the course of military R&D might be influenced. One task for future S&T studies, therefore, would be to combine insights from this great variety of studies for a better understanding of these innovation processes. Some steps have already been taken. The lengthy road of developing new weapons systems implies that it will be hard to halt or even redirect a system at the end, when much investment has been made. Influencing weapons innovations, therefore, implies a continuous process of assessment, evaluation, and (re-)directing, starting at the early stages of the R&D process (see also Technology Assessment). Just striving for ‘technological superiority,’ one of the traditional guiding principles in weapons development, will lead to what is seemingly an autonomous process. Seemingly, because technology development is never truly autonomous. The appearance of autonomy results from the fact that many actors (i.e., organizations) are involved in developing technology, and no single actor on its own is able to steer the entire development. Rather, all actors involved are connected within a network—we may call it a sociotechnical network—working together and collectively realizing a certain direction in the weapons innovation process. Network approaches, appearing in several areas of science and technology studies since the mid-1980s, in which the positions, views, interests, and cultures of the actors involved are analyzed, as well as their mutual links and the institutional settings in which the actors operate, open up an interesting road for dealing with the question of influencing military technological developments (Elzen et al. 1996). Such approaches emphasize the interdependencies between the actors and focus on the nature of their mutual interactions (see also Actor Network Theory). From a network approach it is evident why one single organization by itself cannot determine technological developments. At the same time, network approaches have the power to analyze the way these developments may be influenced by actors in a network. Networks both enable and constrain the possibilities of influencing technology developments. Analyzing them may provide clues as to how to influence them. Weapons innovation and its associated military R&D differ in one respect from nearly all other technologies, in that there is virtually only one customer of the end product—that is, the state. (Some civil industries, like nuclear power and telecommunications in the past, also show considerable similarities in market structure—monopolies or oligopolies coupled with one, or at most a few, dominant purchasers; they are also highly regulated, and markedly different from the competitive consumer goods sectors; see Gummett 1990.) Moreover, only a specific set of actors comprises the sociotechnical networks of military technological developments, including the defense industry, the military, the defense ministry, and the government. The defense ministry, as the sole buyer on the monopsonistic armament market, has a crucial position. In addition, the defense ministry is heavily involved in the whole R&D process by providing, or refunding to industry, much of the necessary funds. Yet the defense ministry, in its turn, is dependent on the other actors, like the defense industry and military laboratories, which provide the technological options from which the defense ministry may choose. S&T studies of this interlocked behavior offer a promising approach for making progress on the issue of regulatory regimes. Not steering from a central position, but adopting instead an approach of ‘decentralized regulation’ in which most actors participate, then seems the viable option for influencing military technological developments. In this connection various ‘guiding principles’ could play a role in a regulatory regime for military technological innovation (see also Enserink et al. 1992). Such guiding principles could include the ‘proportionality principle,’ ‘humanitarian principles,’ and ‘limiting weapons effects to the duration of conflict’ (contrary, for instance, to the use of current anti-personnel mines). Additional phenomena that should in any event be taken into account in such network approaches are the increasing international cooperation and amalgamation of the defense industry, the possibly increasing integration of civil and military technology, and the constraining role of international arms control agreements. The intricate relation between civil and military technology will briefly be discussed in the final section.
4. Integration of Civil and Military Technology

Sociotechnical network approaches seem particularly useful for studying the relation between civil and military technology. This issue has assumed increasing interest because of the desire to integrate civil and military technological developments. Technologies that have both civil and military applications are called dual-use technologies. The desire for integration originates (a) from the need for lower-priced defense products because of reductions in procurement budgets, and (b) from the new situation that in a number of technological sectors innovation in the civil sector has outstripped that in the military sector. Examples of such sectors are computers and information and communication technology, where such integration is already emerging. Such integration, of course, could be at odds with a policy of preventing weapons proliferation, as discussed before. Research on the transformations needed to apply civil technologies in the military sector, and vice versa, has only just begun. The extent to which civil and military technologies diverge depends not only on diverging needs and requirements, but also on the different institutional, organizational, and cultural contexts in which these developments occur. Several case studies illustrate how intricately interwoven the characteristics of a technology may be with the social context in which it is being developed or in which it functions. MacKenzie (1990) conducted a very detailed study of the development of missile guidance technologies in relation to their social context, and of the technological choices made for improving accuracy, not only for missiles but also for aircraft navigation. He showed how different emphases in requirements for missile accuracy and for civil (and military) air navigation resulted in alternative forms of technological change: the former focusing on accuracy, the latter on reliability, producibility, and economy. A number of historical studies have addressed the question of the relation between civil and military technology. Some of them have shown that in a number of past cases the military successfully guided technological developments in specific directions that also penetrated the civil sector. Smith (1985), for instance, showed that the manufacturing methods based on ‘uniformity’ and ‘standardization’ that emerged in the USA in the nineteenth century were more or less imposed (though not without difficulties and setbacks) by the Army Ordnance Department’s wish for interchangeable parts. Noble (1985) investigated, from a more normative perspective, three technical changes in which the military played a crucial role—namely, interchangeable parts manufacture, containerization, and numerical control—arguing that different or additional developments would have been preferable from a different value system. Studies going back to the nineteenth or early twentieth centuries, though interesting from a historical perspective, may not always be relevant for modern R&D and current technological innovation processes. Systematic studies of the interrelation between current military and civil technological developments are only of recent date (see Gummett and Reppy 1988). They point out that it may be useful to distinguish not only between different levels of technology, such as generic technologies and materials, components, and systems (Walker et al. 1988), but also between products and manufacturing or processing technologies (Alic et al. 1992). In certain technological areas the distinguishing features between civil and military technology may be found at the level
of system integration, rather than at the component level.
In conclusion, one may say that military technology, for many reasons, is a fascinating field for future science and technology studies: its links with a broad range of societal institutions, its vital role in international security issues, the need to influence its development, and its increasing integration with civil technology, particularly in the sector of information and communication technologies, which may revolutionize military affairs.
See also: Innovation: Organizational; Innovation, Theory of; Military and Disaster Psychiatry; Military Geography; Military History; Military Psychology: United States; Military Sociology; Research and Development in Organizations; Science and Technology, Social Study of: Computers and Information Technology
Bibliography
Alic J A, Branscomb L M, Brooks H, Carter A B, Epstein G L 1992 Beyond Spinoff: Military and Commercial Technologies in a Changing World. Harvard Business School Press, Boston
Allison G T, Morris F A 1975 Armaments and arms control: Exploring the determinants of military weapons. Daedalus 104(3): 99–129
Ball N, Leitenberg M 1983 The Structure of the Defense Industry. Croom Helm, London and Canberra
Brooks H 1975 The military innovation system and the qualitative arms race. Daedalus 104(3): 75–97
Buzan B 1987 An Introduction to Strategic Studies: Military Technology and International Relations. Macmillan, Basingstoke, UK
Cowen R 1986 Defense Procurement in the Federal Republic of Germany: Politics and Organization. Westview Press, Boulder, CO
Demchak C C 1991 Military Organizations, Complex Machines: Modernization in the US Armed Services. Cornell University Press, Ithaca, NY
Ellis J 1987 The Social History of the Machine Gun. The Cresset Library, London. (Reprint of 1975 ed.)
Elzen B, Enserink B, Smit W A 1996 Socio-technical networks: How a technology studies approach may help to solve problems related to technical change. Social Studies of Science 26(1): 95–141
Enserink B, Smit W A, Elzen B 1992 Directing a cacophony—weapon innovation and international security. In: Smit W A, Grin J, Voronkov L (eds.) Military Technological Innovation and Stability in a Changing World: Politically Assessing and Influencing Weapon Innovation and Military Research and Development. VU University Press, Amsterdam, The Netherlands, pp. 95–123
Gansler J S 1980 The Defense Industry. MIT Press, Cambridge, MA
Gansler J S 1989 Affording Defense. MIT Press, Cambridge, MA
Gilpin R 1962 American Scientists and Nuclear Weapons Policy. Princeton University Press, Princeton, NJ
Gleditsch N P, Njølstad O (eds.) 1990 Arms Races: Technological and Political Dynamics. Sage, London
Greenwood T 1990 Why military technology is difficult to constrain. Science, Technology, and Human Values 15(4): 412–29
Gummett P H 1990 Issues for STS raised by defense science and technology policy. Social Studies of Science 20: 541–58
Gummett P H, Reppy J (eds.) 1988 The Relations Between Defence and Civil Technologies. Kluwer, Dordrecht, The Netherlands
Hacker B C 1994 Military institutions, weapons, and social change: Towards a new history of military technology. Technology and Culture 35(4): 768–834
Kaldor M 1983 The Baroque Arsenal. Sphere, London
Kolodziej E A 1987 Making and Marketing Arms: The French Experience and its Implications for the International System. Princeton University Press, Princeton, NJ
Long F A, Reppy J (eds.) 1980 The Genesis of New Weapons. Pergamon, New York
MacKenzie D 1990 Inventing Accuracy: A Historical Sociology of Nuclear Missile Guidance. MIT Press, Cambridge, MA
Mendelsohn E, Smith M R, Weingart P (eds.) 1988 Science, Technology and the Military. Sociology of the Sciences Yearbook, Vol. XII/1. Kluwer, Boston
Molas-Gallart J 1997 Which way to go? Defense technology and the diversity of 'dual-use' technology transfer. Research Policy 26: 367–85
Noble D F 1985 Command performance: A perspective on the social and economic consequences of military enterprise. In: Smith M R (ed.) Military Enterprise and Technological Change: Perspectives on the American Experience. MIT Press, Cambridge, MA, pp. 329–46
Rosen S P 1991 Winning the Next War: Innovation and the Modern Military. Cornell University Press, Ithaca, NY
Sapolsky H M 1977 Science, technology and military policy. In: Spiegel-Rösing I M, de Solla Price D (eds.) Science, Technology and Society: A Cross-disciplinary Perspective. Sage, London, pp. 443–72
Smith M R 1985 Army ordnance and the 'American system' of manufacturing, 1815–1861. In: Smith M R (ed.) Military Enterprise and Technological Change: Perspectives on the American Experience. MIT Press, Cambridge, MA, pp. 39–86
Smit W A 1989 Defense technology assessment and the control of emerging technologies. In: ter Borg M, Smit W A (eds.) Non-provocative Defence as a Principle of Arms Reduction. Free University Press, Amsterdam, The Netherlands, pp. 61–76
Smit W A 1995 Science, technology, and the military: Relations in transition. In: Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, London, pp. 598–626
Smit W A, Grin J, Voronkov L (eds.) 1992 Military Technological Innovation and Stability in a Changing World: Politically Assessing and Influencing Weapon Innovation and Military Research and Development. VU University Press, Amsterdam, The Netherlands
The Annals of the American Academy of Political and Social Science 1989 Special Issue: Universities and the Military. 502 (March)
Walker W, Graham M, Harbor B 1988 From components to integrated systems: Technological diversity and integration between military and civilian sectors. In: Gummett P H, Reppy J (eds.) The Relations Between Defence and Civil Technologies. Kluwer, Dordrecht, The Netherlands, pp. 17–37
13703
York H F 1971 Race to Oblivion: A Participant's View of the Arms Race. Simon and Schuster, New York
York H F (comp.) 1973 Arms Control: Readings from Scientific American. Freeman, San Francisco
W. A. Smit
Scientific Academies, History of
Scientific academies are associations of scientific practitioners like the Académie Royale des Sciences of Paris, the Royal Society of London, the Berlin Academy, the Russian Academy of Sciences, or the US National Academy of Sciences. Since their inception in seventeenth-century Europe, the distinctive feature of scientific academies has been the regulation of their membership and activities according to corporate protocols, statutes, and bylaws (which, nevertheless, may be developed in accordance with the policies of the state authorities that sometimes sponsor them). Typical activities of scientific academies include the publication of journals, monographs, and the collected works of famous scientists; the award of prizes and medals (the Nobel Prize being the most famous example); and the organization of scientific meetings and, more rarely, of research projects (sometimes in collaboration with other academies). They may also take an advising role in matters of science policy, and may be called upon to counsel governments on scientific and technological projects.
The sociological and historical significance of scientific academies is tied to their crucial role in the development of science as a peer-based system, in constituting the 'scientist' as a distinct socio-professional role, in establishing networks of communication and collaboration, and, more generally, in fostering the so-called internationalism of science (Crawford 1992). Historically, scientific academies have functioned as institutions demarcating the elite of scientific practitioners from other socio-professional groups (which may have included other scientific practitioners pursuing different methodological programs or more practical subjects). By doing so, they have created the conditions of possibility for corporate authority in science (though, historically, different academies have taken different stances with respect to their role as judges of scientific and technical claims). They have also constituted themselves as crucial 'lobbying' entities in negotiations between corporate science and the state.
But while academies are seen as a cornerstone in the development of science as a peer-based and peer-regulated system, the very notion of peer has undergone important changes since the seventeenth century and remains contested even today. There is a circular relationship between the definition of peer and academy.
Academies are constituted by peers, but they also constitute what 'peer' means. A peer is not only someone who has specific technical competencies in a given scientific field, but also a person who, through a process of training and professional socialization, has developed (and has been given the opportunity to develop) a specific corporate sociability and set of values. The ingredients of such a sociability are rooted in the historical, disciplinary, and socio-political context in which a given academy developed, but there are some features that have cut across these various contexts. One of them is gender. A historically conspicuous feature of scientific academies is their distinctly male corporate culture which, until the beginning of the twentieth century, led to the exclusion of women from their membership—an exclusion that was less marked in other, less corporate sites of science (Schiebinger 1989).
Another theme that runs through the literature on academies is their relative 'independence.' Some have seen academies as providing an institutional boundary between science and society, thereby shielding the allegedly ever-threatened 'purity' of science (Ben-David 1971). Others, instead, have taken the empirically more defensible view that academies are institutions constituted within specific socio-political ecologies—ecologies to whose development they contribute from within. The former view sees science as in need of being or becoming as independent as possible from the social nomos, while the latter treats it as a set of practices that have articulated increasingly finer and more pervasive relationships with the socio-political order precisely by developing more specialized and professionalized institutional forms. In one case, institutionalization means separation; in the other it means integration within an increasingly articulated interplay of social institutions.
These considerations bear directly on the problem of defining 'scientific academy.' It is easy to compile a long and inclusive list of past or present academies, but it would be much harder, perhaps impossible, to categorize them as a long-standing institutional type like, say, the university. Depending on historical and national contexts, 'scientific academy' refers to different types of associations with different structures, functions, methodological orientations, membership requirements, notions of intellectual property and authorship, funding arrangements, and affiliations with private patrons or state governments and bureaucracies. Since their inception in seventeenth-century Europe, scientific academies have dealt with the production, reward, and dissemination of scientific knowledge, the training of practitioners, the evaluation of patent applications, and the advising of state and political authorities on scientific and technical matters. Some academies have been established by practitioners themselves and tend to be independently funded. Others are state-sponsored and tend to function as a body of scientific experts for the state that
supports them. Certain academies have a distinctly international membership, while others draw it from a specific country (though they may also extend limited forms of membership to a few foreign practitioners). Some select their members from one specific discipline, others gather practitioners from different fields. Some promote and fund the production of new knowledge, others concern themselves with its dissemination. Most modern academies do not offer stipends to researchers (and may even require their members to contribute annual fees), while some earlier academies (Paris, Berlin, and St Petersburg) did provide their members with salaries and research funds (McClellan 1985).
Because scientific academies have assumed most of the functions now typical of other institutions, it is possible to find partial analogies between academies and these other components of the social system of science. For instance, it may be difficult in some cases to draw a sharp demarcation between scientific academies and other institutions such as discipline-based professional associations, national laboratories and research institutes, foundations (like Rockefeller in the US), national agencies for the development of science (CNRS in France, NSF or NIH in the US, CNR in Italy, etc.), or even university science departments. Similar problems emerge when we try to distinguish scientific academies from other, less scientific-sounding associations which, nevertheless, promote or promoted the production of knowledge about nature. In this category we find early modern philosophical salons (usually established and run by women (Lougee 1976)), religious orders that supported collective work in mathematics, astronomy, and natural history (like the Society of Jesus), provincial academies of natural philosophy, belles lettres, and history (common in Europe and America throughout the nineteenth century), nonprofessional associations for the popularization of science (which may be seen as the modern heirs to earlier provincial academies), or contemporary associations of amateur scientists.
The same taxonomical difficulties emerge when one looks for some essential demarcation between academies and collection-based institutions like the Smithsonian Institution in Washington and other museums of science and natural history, and botanical gardens. In the premodern period, such institutions tended to be associated with academies and, as shown by the American Museum of Natural History in New York, may still perform some of the research functions of their historical ancestors. The same may be said about astronomical observatories. Many of them (Greenwich, Paris, Berlin, St Petersburg, etc.) were developed within scientific academies or soon became annexed to them, but later evolved into independent institutions or became linked to other state agencies (like the Bureau des Longitudes in nineteenth-century France). In sum, one either runs the risk of treating 'academy' as a synonym for 'scientific institution' (in the same
way some apply the term 'laboratory' to anything from large-scale international organizations like CERN and private industrial entities like Bell Labs, to early modern alchemical workshops or to Uraniborg—Tycho Brahe's sixteenth-century aristocratic observatory-in-a-castle (Christianson 2000)) or, in a reverse move, of identifying academies solely with institutions bearing that name (but likely to display remarkably different structures and functions), while leaving out a slew of other associations whose activities may in fact overlap with those pursued, at one point or another, by 'proper' academies.
A possible solution to this definitional puzzle is to take a genealogical (rather than taxonomical) approach. In any given historical and national context there is a division of labor and functions between scientific academies and the surrounding institutions of science and of the state bureaucracy. It is this institutional ecology that we need to look at to understand what role an academy fulfills in any given context. If we take peer-based management to be the distinctive trait of academies, then a reasonable starting point of their genealogy is the development of legal statutes and bylaws; that is, of instruments for the establishment of their corporate form of life. This makes the Royal Society of London (chartered in 1662) the first scientific academy. This criterion excludes a number of well-known seventeenth-century academies (such as the Lincei (1603–1631), the Cimento (1657–1667), and even the early Académie Royale des Sciences, which began to gather around 1666 but obtained its first statute only in 1699). Academies that did not have legal statutes or charters gathered at the will of a prince who sponsored them, and operated within the rules and system of exchange typical of the patronage system. They can be seen as the extension of the model of informal court gatherings and aristocratic literary salons to the domain of natural philosophy. They were prince-dependent not only in the sense that they were funded by princes, but their work, publications, membership, and sociability were structured by their direct, quasi-personal relationship to the patrons who, in several cases, participated in them. In these contexts, academicians were more princely subjects than collaborating scientific peers (Biagioli 1996).
The next step—the development of charters, statutes, and bylaws—marked the transition from the patronage framework to institutions which, while still connected to royal power, had a more formal and bureaucratic relationship to the state. This institutional development followed two different models, one exemplified by the Royal Society of London, the other by the Académie Royale des Sciences. These two models matched quite closely the structure and discourse of royal power in those two countries: absolutism in France and constitutional monarchy in England.
In fact, despite its name, the Royal Society was a private institution (some may say a large gentlemen's club) that, while chartered by the king, received negligible funding and virtually no supervision from the crown (Hunter 1989). Its members were nominated and approved by the Society itself. Similar protocols were followed for the selection of the president and other officials. There were no paid positions except those of the secretary and the curator of experiments, and the operating budget came mostly from annual membership fees, which were typically hard to collect. The training of young practitioners was not one of the Society's stated goals. The Académie des Sciences, instead, was much more of a state agency. The French crown funded it, provided stipends, and closely controlled its projects, membership, and publications. Its members were grouped according to disciplinary taxonomies, and organized hierarchically according to seniority (with students at the bottom) and, sometimes, social status. The king selected crown bureaucrats to run its operation (Hahn 1971).
These two models and their various permutations were adopted during the remarkable development (almost an explosion) of scientific academies in the eighteenth century. Driven by a mix of national pride, emulation, and the state's growing appetite for technical expertise, virtually all European states, from England to Russia, developed such academies (McClellan 1985). Academies proliferated not only across but within nations, moving from the capitals to the periphery (Roche 1978). Reflecting the logic of political absolutism, state academies tended to follow the French paradigm, while provincial and private academies, having little or no connection to central state power, looked to the Royal Society as their primary model. The center–periphery distinction maps quite well onto the disciplinary scope of early scientific academies: provincial institutions often included literature and history sections, while centrally located academies tended to specialize in natural philosophy.
Until the French Revolution, academies constituted the primary site of scientific activity, epitomized the very notion of scientific institution, and provided science with a kind of social and professional legitimation it had struggled to achieve in previous periods. Some military schools (especially in France) shared the academies' concern with research. But with the exception of Bologna, Pavia, and Göttingen, early modern universities usually limited themselves to the teaching of (not the research in) natural philosophy.
If the eighteenth century was the golden age of scientific academies, the nineteenth century saw the beginning of their decline, or at least their slow but steady reframing from research institutions into prestige-bearing ones. New kinds of institutions emerged and older ones were reformed as the social system of science developed in scale and specialization.
The remarkable complexity of this scenario is such that only a few general trends can be discussed here, and only in reference to their impact on the role of academies.
First, starting with Germany, the restructuring of the university led to an increased representation of the sciences in its curriculum and, more importantly, to a slow but steady trend toward research, not only teaching, in the natural sciences. In the second half of the nineteenth century, this trend was followed by other countries including the US, where new universities (first Johns Hopkins and then the so-called land-grant state universities) were developed with an eye on European research models—models that were slowly adopted by Ivy League universities as well. This trend eroded much of the academies' pedagogical and research role.
Second, with the slow but progressive rise of science-based industry, the private sector (often in collaboration with universities) provided more niches for research and for the employment of fast-growing cadres of scientists. Widespread industrial development eroded another aspect of the academies' traditional function: that of judge of technological innovation. The emergence of the patent system, first in England but, after 1800, in most other European countries and the US, placed the reward and protection of technological innovation in the domain of the law and of the market and disconnected it from its authoritative assessment by academies (as had been the case in early modern France and other countries).
Third, the establishment or growing role of state or military technical agencies and schools (like, say, the Corps of Engineers and Geological Survey in the US, or the École Polytechnique and the École des Ponts et Chaussées in France) greatly reduced the role of scientific academies as providers of technical and scientific expertise to the state (Shinn 1980).
Fourth, with the branching of science into increasingly numerous disciplines, a slew of professional associations emerged and, with them, increasingly specialized journals. Not only did these discipline-based associations erode the academies' quasi-monopoly on scientific publishing and translations, but they also took up much of their function as international nodes of communication, collaboration, and standardization of methodologies, terminology, and units of measurement. The 'internationalism' of science, which largely had been connected not only to networks of communication among academies but also to the export and import of academicians across national boundaries (most conspicuously in the eighteenth century), was replaced by more specific disciplinary and professional networks and by the migration of students (and later postdoctoral researchers) between universities and, later, institutes and laboratories. Similarly, the academies' role in the diffusion of scientific knowledge was eroded by these trends, as well as by the development of a wider nonprofessional audience for science and of
the popular publications and magazines that catered to them.
All these trends continued and were further accelerated in the twentieth century, when the additional creation of national agencies for the promotion and funding of science and the growing role of private foundations in the patronage of science took on yet another role that had been traditionally played by academies. Even large-scale, state-sponsored technical and military projects (which, historically, had seen the direct or indirect participation of scientific academies) have become commonly structured around collaborations between the state, the university, and industry. International scientific collaborations are also primarily managed by universities and national science agencies and, in the case of science-based industry, through joint ventures, licensing, patent sharing, or mergers and acquisitions. (These trends, however, are typical of Western contexts. In the former Soviet Union, the Soviet Academy of Sciences played a dominant role in research, training, and rewards at a time when its western counterparts had largely shed those functions (Vucinich 1984).)
In sum, it is possible to identify at least three phases in the genealogy and role of academies within the broader ecology of the social system of science: (a) patronage-based princely academies without statutes and corporate protocols, up to the end of the seventeenth century; (b) peer-based academies with statutes, publications, and, in many cases, formalized relations with the state, since the late seventeenth century and mostly during the eighteenth century; and (c) a dramatic increase in the scale and complexity of the social system of science, associated with the multiplication of more specialized scientific institutions taking up most of the traditional roles of scientific academies, since the early nineteenth century.
The net result of this historical trajectory is that academies have moved from being informal gatherings structured by the curiosity and philosophical interests of princes and gentlemen, to comprehensive and multifunctional corporate entities crucial to the production and legitimation of science and its social system, to prestigious institutions that are marginally involved in research but still maintain an important advising role to the state and other constituencies, and bestow professional recognition on leading scientists through membership and awards. In many ways, the changing role of prizes awarded by academies encapsulates this trajectory: in the early modern period, prizes were seen as means to promote new research (not unlike today's grants), but now they function as rewards for important work that scientists have done in the past.
The importance of the largely symbolic role of modern scientific academies, however, should not be underestimated. In an age when 'lobbying' has become the dominant mode of corporate participation in
political decisions (especially in the US), academies remain the authoritative agents of corporate science, however accurately they may be able to represent the interests and concerns of such a large and diversified community.
See also: Centers for Advanced Study: International/Interdisciplinary; History of Science; History of Science: Constructivist Perspectives; Scientific Disciplines, History of; Universities, in the History of the Social Sciences
Bibliography
Ben-David J 1971 The Scientist's Role in Society: A Comparative Study. Prentice-Hall, Englewood Cliffs, NJ
Biagioli M 1996 Etiquette, interdependence, and sociability in seventeenth-century science. Critical Inquiry 22: 193–238
Boehm L, Raimondi E 1981 Università, Accademie e Società Scientifiche in Italia e in Germania dal Cinquecento al Settecento. Il Mulino, Bologna, Italy
Cahan D 1984 The institutional revolution in German physics, 1865–1914. Historical Studies in the Physical and Biological Sciences 15: 1–65
Cavazza M 1990 Settecento inquieto: Alle origini dell'Istituto delle Scienze di Bologna. Il Mulino, Bologna, Italy
Christianson J R 2000 On Tycho's Island. Cambridge University Press, Cambridge, UK
Crawford E 1992 Nationalism and Internationalism in Science, 1880–1930: Four Studies of the Nobel Population. Cambridge University Press, Cambridge, UK
Crosland M 1992 Science Under Control: The French Academy of Sciences, 1795–1914. Cambridge University Press, Cambridge, UK
Daniels G 1976 The process of professionalization of American science: The emergent period, 1820–1860. In: Reingold N (ed.) Science in America Since 1820. Science History Publications, Canton, MA, pp. 63–78
Fox R, Weisz G 1980 The Organization of Science and Technology in France, 1808–1914. Cambridge University Press, Cambridge, UK
Frängsmyr T (ed.) 1989 Science in Sweden: The Royal Swedish Academy of Sciences, 1739–1989. Science History Publications, Canton, MA
Frängsmyr T (ed.) 1990 Solomon's House Revisited. Science History Publications, Canton, MA
Graham L 1967 The Soviet Academy of Sciences and the Communist Party, 1927–1932. Princeton University Press, Princeton, NJ
Hahn R 1971 The Anatomy of a Scientific Institution. University of California Press, Berkeley, CA
Heilbron J L 1983 Physics at the Royal Society During Newton's Presidency. William Andrews Clark Memorial Library, Los Angeles
Hunter M 1989 Establishing the New Science. Boydell, Woodbridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Kevles D 1978 The Physicists: The History of a Scientific Community in Modern America. Knopf, New York
Knowles Middleton W E 1971 The Experimenters: A Study of the Accademia del Cimento. Johns Hopkins University Press, Baltimore, MD
Kohlstedt S 1976 The Formation of the American Scientific Community. University of Illinois Press, Urbana, IL
Lougee C 1976 Le Paradis des Femmes: Women, Salons, and Social Stratification in Seventeenth-century France. Princeton University Press, Princeton, NJ
MacLeod R, Collins P (eds.) 1981 The Parliament of Science: The British Association for the Advancement of Science, 1831–1981. Kluwer, Northwood, UK
McClellan J E 1985 Science Reorganized. Columbia University Press, New York
Moran B 1992 Patronage and Institutions. Boydell, Woodbridge, UK
Morrell J, Thackray A 1981 Gentlemen of Science: Early Years of the British Association for the Advancement of Science. Clarendon Press, New York
Oleson A, Brown S C (eds.) 1976 The Pursuit of Knowledge in the Early American Republic: American Scientific and Learned Societies from Colonial Times to the Civil War. Johns Hopkins University Press, Baltimore, MD
Pyenson L, Sheets-Pyenson S 1999 Servants of Nature. Norton, New York
Roche D 1978 Le Siècle des Lumières en Province. Mouton, Paris
Schiebinger L 1989 The Mind Has No Sex? Harvard University Press, Cambridge, MA
Shinn T 1980 Savoir scientifique et pouvoir social: L'École Polytechnique, 1794–1914. Presses de la Fondation Nationale des Sciences Politiques, Paris
Vucinich A 1984 Empire of Knowledge: The Academy of Sciences of the USSR, 1917–1970. University of California Press, Berkeley, CA
M. Biagioli

Scientific Academies in Asia

The criterion for selection of the institutions described in this article is that academic research is their main function. Hence, academies that focus mainly on teaching, and policy-oriented research institutes, are not included. Taiwan's Academia Sinica is examined in terms of its organization and research accomplishments, as well as its role in society. Similar research academies in mainland China and in Japan are also briefly introduced.

1. The Academia Sinica, Taiwan

1.1 Organization

The Academia Sinica was founded in mainland China in 1928 with two major missions: to undertake research in science and the humanities, and to instigate, coordinate, and encourage academic research. In 1949, after the Chinese civil war, the Academia was moved to Taiwan and re-established at its present site in 1954. The Academia has been headed by seven presidents since its founding, and the current president, Dr. Lee Yuan-Tseh, a Nobel laureate in chemistry, took office in 1994.
The upper-level organization of the Academia Sinica comprises three parts: the convocation, the council, and the central advisory committee. The convocation is a biennial meeting attended by preeminent Chinese scholars from all over the world who are elected to be academicians of the Academia Sinica. The year 2000 saw the 26th convocation, at which most of the 193 academicians—academician being an honorary lifetime title—gathered to elect new academicians and council members. They also proposed research policies for the Academia Sinica. The council consists of 18 ex officio members, including all directors of research institutes, and 35 elected members. Specific functions of the council include the review of research projects, the evaluation of proposals related to institutional changes, the promotion of academic cooperation within and outside Taiwan and, perhaps most importantly, the presentation of a shortlist of three candidates for the presidency of the Academia Sinica to the President of the Republic of China, who makes the final decision. The central advisory committee, established in 1991, includes the chairpersons of the advisory committees of individual institutes and three to nine specialists nominated by the president of the Academia Sinica. Their tasks are to recruit scholars of various disciplines as well as to suggest long-term and interdisciplinary research plans to the president. The committee is also responsible for evaluating large-scale cross-institutional research projects, applications for postdoctoral research posts, and the annual awards for junior researchers' publications.
But the core of the Academia Sinica is made up of the 25 institutes/preparatory offices classified into three divisions: Mathematics and Physical Sciences, Life Sciences, and Humanities and Social Sciences (see Table 1). In 2000, 815 research staff (including 13 distinguished research fellows, 324 research fellows, 197 associate research fellows, 138 assistant research fellows, and 143 research assistants) conducted active research either individually or in groups within, as well as across, institutions. In addition to the research staff, postdoctoral researchers, contracted research assistants, and administrative officers bring the total to approximately 3,000 people working at the Academia Sinica.

1.2 Research Focus and Achievements

There are six fundamental principles or basic goals guiding academic research in the Academia Sinica (Lee 1999). Examples of research accomplishments, landmark research projects, and significant publications are discussed below.
Table 1 Number of research staff at Academia Sinica, Taiwan: 1994–2000

Research institutes                                 1994  1995  1996  1997  1998  1999  2000
Division of Mathematics & Physical Sciences
  Institute of Mathematics                            30    33    32    34    32    31    31
  Institute of Physics                                41    41    39    39    42    43    42
  Institute of Chemistry                              29    28    27    27    28    30    27
  Institute of Earth Sciences                         37    36    35    34    37    37    34
  Institute of Information Sciences                   28    29    30    30    32    34    36
  Institute of Statistical Sciences                   29    31    31    32    30    32    32
  Institute of Atomic and Molecular Sciences          21    22    22    22    22    24    24
  Institute of Astronomy and Astrophysics (a)          1     2     4     7    10    10    13
  Institute of Applied Science & Engineering (a)       —     —     —     —     —     —     2
  Subtotal                                           216   222   220   225   233   241   241
  No. increase                                         —     6    −2     5     8     8     0
  Rate of increase (percent)                           —  2.78 −0.90  2.27  3.56  3.43  0.00
Division of Life Sciences
  Institute of Botany                                 41    41    41    39    38    39    40
  Institute of Zoology                                28    30    30    29    30    29    27
  Institute of Biological Chemistry                   30    31    33    34    35    34    36
  Institute of Molecular Biology                      80    84    82    79    82    85    87
  Institute of Biomedical Sciences                    51    48    49    54    54    55    54
  Institute of BioAgricultural Sciences (a)            —     —     —     —     —     8    11
  Subtotal                                           230   234   235   235   239   250   255
  No. increase                                         —     4     1     0     4    11     5
  Rate of increase (percent)                           —  1.74  0.43  0.00  1.70  4.60  2.00
Division of Humanities & Social Sciences
  Institute of History and Philology                  65    66    66    69    60    58    56
  Institute of Ethnology                              39    40    25    27    28    27    27
  Institute of Modern History                         53    51    51    49    48    49    46
  Institute of Economics                              48    48    49    49    49    46    46
  Institute of American and European Studies          35    34    37    34    34    33    32
  Sun Yat-Sen Institute for Social Sciences
    and Philosophy                                    48    48    44    47    45    47    46
  Institute of Sociology                              14    16    17    18    19    21    18
  Institute of Chinese Literature and Philosophy (a)   —     —    17    19    17    22    21
  Institute of Taiwan History (a)                      1     6    10    13    14    15    16
  Institute of Linguistics (a)                         —     —     —     —    12    11    11
  Subtotal                                           303   309   316   325   326   329   319
  No. increase                                         —     6     7     9     1     3   −10
  Rate of increase (percent)                           —  1.98  2.27  2.85  0.31  0.92 −3.04
Total                                                749   765   771   785   798   820   815
No. increase                                           —    16     6    14    13    22    −5
Rate of increase (percent)                             —  2.14  0.78  1.82  1.66  2.76 −0.61

(a) Preparatory Office

1.2.1 A balance between scientific and humanitarian research. Critics have condemned the dominance of scientific over humanitarian research in Taiwan.
This imbalance came about because of the general need to modernize the country's economy in the postwar years. During this process, most resources were allocated to technology-oriented research. As a result, the natural sciences and the life sciences enjoyed a major share of manpower as well as of budget. In later years, especially after the lifting of martial law on
Taiwan in the 1980s, the environment for social science research improved greatly. The humanities and social sciences (at least in the Academia Sinica) have received relatively fair treatment if the adequacy of the regular budget is used as an indicator. This is perhaps a rare phenomenon worldwide. The Academia Sinica claims to have achieved balanced development among its three divisions:
Mathematics and Physical Sciences, Life Sciences, and Humanities and Social Sciences. This has been achieved by setting different criteria for the various divisions with regard to both research direction and evaluation, and by building up a lively academic community that allows researchers from various divisions to work on the same topic. A recently completed study of Taiwanese aborigines is one example of such a project. Under the coordination of a researcher from the Institute of Ethnology, staff from the Institute of Biomedical Sciences, along with scholars from universities and hospitals, jointly investigated the migratory history, the hereditary traits evident in blood samples, and the ethnic differences of the aborigines in order to establish possible genetic polymorphic markers. This type of joint endeavor by different disciplines continues and is given priority in the funding process. An ongoing study of cardiovascular-related disease, the second leading cause of death in Taiwan, is another example. The study intends to collect multifaceted community data to help combat this disease. Funded by the Academia Sinica, researchers in epidemiology, social demography, economics, and statistics have formed an interdisciplinary team to tackle the issue from various angles.
1.2.2 A balance between indigenous and international research. In the late 1990s the question of the relative priority of indigenization versus internationalization of academic research was much debated among social scientists in Taiwan. Supporters of indigenization emphasized the particularistic or unique aspects of social science studies in Taiwan and the importance of avoiding the influence of dominant Western models. To others, however, the internationalization of the social sciences is an inevitable global trend fitting into the theme of 'knowledge without national boundaries.' In these debates on the nature of research, the Academia Sinica takes a balanced stand. On the one hand, it encourages active participation in the valued conventional research areas. On the other hand, focusing on Taiwan's particular social issues and disseminating relevant research findings is considered to be important for the intellectual community and for mankind in general. Hence, the Academia Sinica has funded large-scale research projects that serve both of the above purposes. A recent group project on the long-term development of capitalism in Taiwan has the potential to extend the Taiwanese experience to other Asian economies. This study encompasses the history of agricultural and industrial developments in Taiwan, trade and navigational expansion, macroeconomic performance, and the role the Taiwan Development Company played in the consolidation of capitalism in
the territory. From the colonial era through the period of Japanese rule to the postwar era, Taiwan has gone through significant social transitions in its capitalist development. Although each stage may be characterized by different sets of institutions, one common factor emerges from the historical process: the expansion of exports (from tea, sugar, rice, and processed foods to light manufactured goods and producer goods). The exploration of the origin and evolution of capitalist development in Taiwan will not only benefit the local academies, but will also enhance the comparative study of economic development in other countries.
1.2.3 A balance between basic and applied research. Academic research is the main function of the Academia Sinica. However, increasing demands are being made on it to provide research results for applied use. The Academia Sinica recognizes the necessity of responding to important social phenomena and to the need for technological development in its research endeavors. Heavy emphasis has been placed on the implementation of findings from basic research. An equal amount of effort has also been allocated to promoting applied research that may in turn shed light on subsequent academic research. Two illustrations, from information science and the life sciences, will highlight this recent focus. A study on natural language understanding is directed towards the construction of a computer program with a knowledge system that is capable of understanding human perception of various recognition systems. The project has successfully developed a concept recognition mechanism called the 'Information Map.' This map arranges human knowledge in a hierarchical fashion with a cross-referencing capability. Using the information map, concept understanding can be reduced to intelligent semantic search in the knowledge system. This project has already produced a semantic 'Yellow Pages' search for a telephone company and an automatic question-and-answer agent on the internet. A popular product of its Chinese phonetic input system is a typing software package widely used by the public. Another well-known research project with excellent applied value is the method for detecting differentially expressed genes. The research group has developed a DNA microarray with a colorimetric detection system to simultaneously monitor the expression of thousands of genes in a microarray format on a nylon membrane. Testing on filter membranes and quantifying the expression levels of the target genes in cells under different physiological or diseased states will reduce each screening process to a couple of days. It is clear that such applied research is basically restricted to the non-social sciences. In the social science divisions, market-oriented or consumer-involved
studies remain quite rare. Researchers are mostly committed to basic research funded by the institute or by the National Science Council. Although policy study is ostensibly given a high priority, the fact that the institutes to be established in the near future will still specialize in the basic disciplines reflects a continuing general emphasis on basic research. At present, few policy-oriented research projects are undertaken by individuals.
1.2.4 Coordination and promotion of academic research in Taiwan. Under the new organizational law, which allows research institutes more flexibility to establish ad hoc research centers, the Academia Sinica will be given the responsibility of proposing a national academic research agenda, coordinating academic research in Taiwan, and training intellectual manpower. These tasks reflect the central role of the Academia Sinica in the Taiwanese academies. In order to meet these requirements, the Academia has attempted to collaborate with various universities by exchanging teaching and research staff. Ad hoc research committees and specific research programs including scholars from different institutes have also been established. The committees on Sinological Research, on Social Problems, on Taiwan Studies, and on Mainland China Studies—all established since 1997—exemplify such an effort. These committees are interdisciplinary in nature, and comprise scholars from within as well as outside the Academia Sinica. Each committee may focus on a few selected research issues and organize related seminars/conferences. The committee is also allowed to form various taskforces to plan future collaborative research topics.
Besides the ad hoc research committees, the promotion of large-scale cross-institutional research projects has become important to the Academia Sinica. A so-called 'thematic project' shares a common research framework and includes several individual research topics proposed by researchers from different institutes and universities. The study of the organization-centered society represents one of these group efforts. In this investigation of modern social structure, nine subprojects were proposed, all funded exclusively by the Academia Sinica. The finding that impersonal trust—rather than traditional interpersonal trust—has been key to organizational success in Taiwan's economic development has given rise to the future research perspective that a modern society such as Taiwan is organization-centered. Similar thematic projects aiming to promote collaborative work among various academic institutes have been encouraged by the granting of funds.
However, it should be pointed out that playing a central role does not equate to having a central planning function. Taiwanese social scientists, in comparison with life scientists or their Japanese colleagues, have a strong inclination to pursue individual
research. Various researchers joining together in a large communal laboratory, or generations of scholars working on the same topic, are not common at all. The thematic project in the Academia Sinica, or the joint project promoted by the National Science Council, is more an initiative to encourage collaborative teamwork than a reaction to present research demand. Whether individual projects maintain their importance or are replaced by group efforts will not change the expectations placed on the Academia Sinica—to respect individual research freedom and to facilitate research needs in Taiwan.
1.2.5 Encouragement of international collaboration. Active participation in international research has always been a priority in Taiwan. Researchers are encouraged to present their findings at international meetings and to build collaborative relationships with foreign research groups. Renowned scholars from abroad are also frequently invited to visit and to work with research staff at the Academia Sinica. In line with this trend, a proposal has been made to establish an international graduate school at the Academia Sinica (Chang 2000). The aim of this program is to attract highly qualified young people to pursue their Ph.D. degrees under the guidance of top researchers in Taiwan. The competitive program is intended to provide the opportunity for independent inquiry as well as dynamic peer interaction; it is assumed that the supportive research environment will facilitate the training of future intellectual leaders and creative scholars. Although this program has yet to be finalized, the Academia Sinica has clearly revealed its interest in taking a concrete step towards globalization by investing in brilliant young minds.
1.2.6 Feedback into society. It is the firm belief among academicians that any type of feedback to society from members of the Academia Sinica must be based on solid academic research. Several feedback patterns have been adopted. With regard to emergent social issues, researchers with relevant knowledge and research findings are encouraged to air their suggestions by organizing conferences or public hearings. The problem of the over-planting of betel nut palms on hills and mountains, and its harmful effects on the environment, is an example. Short-term research projects—such as the analysis of juvenile delinquency initiated by the committee on social problems—are another possible strategy. Furthermore, the Academia Sinica opens its campus annually, and individual research institutes sponsor introductory lectures for interested students and visitors. Numerous data sets from social science and biological research, as well as valuable original historical files, are gradually being released for public use.
1.3 The Role of the Academia Sinica
Whether the Academia Sinica should play a role beyond pure academic research and beyond research-based feedback on various social problems has always been hotly debated. Some of the more articulate members have been quite vocal in insisting that it should have a stronger role in economic and social life. When it comes to how far the institute should involve itself in politics, however, the issue is more delicate. The subject is closely linked with the role of contemporary intellectuals in Taiwan (Chang 2000), where it is considered permissible for intellectuals to voice their moral conscience with regard to significant political issues. Nevertheless, it is precisely because most intellectuals are respected for their professional scholarship, not necessarily the correctness of their political views, that the appropriateness of actual political participation is seriously questioned.
For most ordinary members of the Academia Sinica, the pressure to produce excellent research is the foremost stimulus. Academic ambition is factored into the evaluation of promotion, the review criteria for assessing institutes' research accomplishments, the process of planning new research development, and the regular report to the Legislative Yuan. Nevertheless, considering that the research staff are government employees, the public perhaps has a right to voice concerns about the public utility of the Academia Sinica, whatever its academic credentials and quality of research. When challenged about its value to the Taiwanese taxpayer, the Academia Sinica usually reminds the public of its past research accomplishments and its dynamic future role, concrete evidence of which can be found in several newly established research institutes.
When it comes to past achievements, the most senior institute in the Humanities and Social Sciences division—the Institute of History and Philology—is often cited. It was established in the same year as the Academia Sinica (1928). Early collective projects such as the An-Yang excavation, the study of Chinese dialects, and the reconstruction of ancient histories gained international fame for the institute. The institute is also engaged in the systematic compilation and organization of valuable Chinese historical documents, which contributes enormously to the field of Sinology and further enhances its academic status.
Far from resting on the academic achievements of the past, the Academia Sinica is constantly trying to stay on the cutting edge of research, as can be seen from the recently established institutes. In the division of Mathematics and Physical Sciences, the Institute of Astronomy and Astrophysics (1993) and the Institute of Applied Science and Engineering Research (1999) are the two latest research institutes. The separation between pure and applied science, especially in the life sciences, is obviously not applicable any more. Within the division of Humanities and Social Sciences, the Institute of Taiwan
History (1993) and the Institute of Linguistics (1997) were formed after drastic social changes had taken place in Taiwan. The Institute of Taiwan History is at the forefront of indigenous research in Taiwan and has become the focal coordinating agency for Taiwan studies.
In short, Taiwan's Academia Sinica is a government-sponsored research institute. With funding available from the regular budget, academic research has been its main prescribed task. The highly trained research staff represents the research elite in Taiwan and has full liberty in deciding on individual projects. In recent years, the Academia Sinica has made a conscious effort to promote major interdisciplinary research programs, both in fulfillment of its leadership role in the Taiwanese academies and as a response to changing societal expectations. A review of recent developments within the Academia Sinica reveals its intention to gain a greater global profile on the basis of its academic performance and generous research resources.
2. Research Academies in Mainland China

2.1 The Chinese Academy of Sciences
The Chinese Academy of Sciences was founded in 1949, the same year in which the Academia Sinica moved to Taiwan. With basic research as its main task, this academy has perhaps the largest organization of any institution of its type in the world. Besides the headquarters in Beijing, 22 branches made up of no fewer than 121 research institutes are scattered all over the country. Across its five academic divisions (Mathematics and Physics, Chemistry, Biological Science, Earth Science, and Technological Science), more than 40,000 scientists and technical professionals work for the Academy. Among them, nearly 10,000 are regarded as basic research staff, and 230 members of the Academy (out of the current 584 members who are elected as the most preeminent scientists in mainland China) also actively engage in research at the Academy. Members of the Academy enjoy the highest academic prestige. They play a planning and consulting role in China's scientific and technological development, as well as providing reports and suggestions regarding important research issues.
A review of the general research orientation of the Chinese Academy of Sciences reveals at least two important characteristics that differentiate it from the Academia Sinica in Taiwan. First, basic research with high applied value has been a major priority of the Academy from the beginning. High-tech research and development centers are growing rapidly, and the support staff composed of well-trained professionals has become a major facilitating force. Second, collaboration with industrial sectors and with foreign institutes has contributed to the Academy's
research resources. The cooperative relationship has been substantial and extensive in that more than 3,000 enterprises have joined the industry–education–research development program. The international exchange program involves 7,000 personnel annually. This program has benefited both the research staff and the postgraduates of the Academy and has fulfilled an important training function: more than 14,000 staff and graduate students have received advanced training abroad since 1978, and over 8,000 have completed their studies and returned to the Academy.
2.2 The Chinese Academy of Social Sciences
The Chinese Academy of Social Sciences was formally established in 1977 from the former Philosophy and Social Sciences division of the Chinese Academy of Sciences. The central headquarters in Beijing is made up of 14 research institutes employing 2,200 staff. Among these centralized institutes are Economics, Archeology, History, Law, and Ethnology. As is the case with the Academy of Sciences, there are many branch institutes throughout China, so that the staff complement totals 4,200 in 31 research institutes.
According to much of the publicity material on the Academy of Social Sciences, the needs of the country appear to be of the utmost importance in the selection of research projects. The material and spiritual development and the democratization of the nation are constantly cited as basic motives for conducting relevant studies. This bias probably owes more to the fact that funding comes from central government than to any policy implications. For example, the national philosophy and social sciences committee has organized several selected research topics every five years, such as the study of changes among rural families under the economic reform, coordinated by Beijing University during the 7th national Five-year Plan. A substantial proportion of the research undertaken by the Academy of Social Sciences consists of special commissions of this sort.
Because the availability of funding is the key to the commencement of any research, there is a substantial reliance on foreign funding. Funding from foreign foundations is usually generous enough to greatly enhance the possibility of conducting extensive studies across different regions of the nation. But collaboration with foreign institutes, especially in social science surveys, tends to be limited to data collection. High-quality academic writing arising from collaborative projects at the Academy of Social Sciences is still relatively scarce and remains to be promoted in the future.
As in the Academy of Sciences, there is a long-standing international academic exchange program in the Academy of Social Sciences. More than 4,100 research staff and graduate students have participated in this program since 1978, and positive outcomes are
revealed in new research projects as well as publications. The 82 academic journals published by the Academy cover disciplines such as sociology, law, history, literature, and world economics.
3. Research Academies in Japan

There are basically two lines of research institutions in Japan: one under the Ministry of Education and the other under the Science and Technology agencies. University-affiliated research institutes, as well as independent national research institutes with graduate schools, come under the jurisdiction of the Ministry of Education. As of 1997, among 587 Japanese universities, 62 had affiliated research institutes, of which 20 were open to all researchers in Japan. There are also 14 independent research institutes in Japan, unaffiliated with any university, carrying out major academic research projects. These so-called interuniversity research institutes are set up in response to a specific demand to undertake academic research that requires resources and manpower beyond the boundaries of any single university. The National Laboratory for High Energy Physics was the first of this kind to be established (1971). The famous National Museum of Ethnology (1974) and the National Institute of Multimedia Education (1997), which aim at scientific research, data collection, and curriculum development, have a substantial research complement of their own and are staffed by visiting scholars from abroad as well as local ones. The distinctive contribution of the interuniversity research institutes lies in basic research. Large-scale facilities and data resources, as well as human resources seconded from universities throughout Japan, are considered important mechanisms in enhancing the progress of scientific research in Japan. Other national research institutes, mostly concerned with natural sciences but also including social sciences (such as the noted National Institute of Population and Social Security Research and the Economic Research Institute), fall into the domain of the Science and Technology agencies. The National Institute of Population and Social Security Research was founded in 1996 by combining two government research organizations: the Institute of Population Problems and the Social Development Research Institute. It is now affiliated with the Ministry of Health and Welfare. A research staff of 45 is located in seven research departments. Although policy-oriented research comprises a major proportion of the institute's work, as the staff themselves recognize, academic research is still encouraged through both institutional and individual efforts. Surveys concerning population and social security are carried out to produce primary and secondary data for policy formulation. At the same time, these data also give rise to future academic studies on social and economic issues. There is also a national advisory board on the scientific development of Japan. The Science Council
of Japan, attached to the Prime Minister's office, was established in 1949. Unlike the Japan Academy or the academicians and academy members of Taiwan and mainland China, the 210 distinguished scientists from all fields who sit on the council are not given honorary lifetime titles but serve three-year terms of office. The council enjoys great academic prestige and represents Japanese scientists domestically as well as internationally. The council has the right to advise the government to initiate important scientific research programs, and the government may seek professional recommendations from the council as well. The council is also actively engaged in bilateral scientific exchanges and other forms of international participation, and it is expected to coordinate academic research in Japan and facilitate the implementation of important decisions concerning academic development in Japan. With a few exceptions, such as the Nihon University Population Research Institute, most academies in Japan are national. But the restrictions stemming from their organizational structure (they come under the jurisdiction of various government ministries) may translate into less research freedom or a higher demand for policy-oriented studies. In addition, those academies or research institutes affiliated with universities are usually also expected to carry out teaching functions at the individual researcher's level.
4. Conclusion

This article has briefly outlined the national character of research organizations in Taiwan, mainland China, and Japan. The importance of the government's role in academic development in this region can be clearly observed. The private sector, in contrast, plays only a minor role, if any, in academic research. However, several differences may be distinguished among the three territories. Taiwan's Academia Sinica is perhaps foremost in terms of both research autonomy and social services research. Benefiting from the cultural tradition and the expectations of the society in which it functions, researchers there also appear to enjoy more resources in funding and in social prestige. The motivation for research can be stated in purely academic terms and no policy orientation is required in order to receive adequate funding. In addition, Taiwanese scholars have shown a stronger preference for individual research projects. Mainland China, in comparison, launched its modern social science sector in the late 1970s, a time when a few basic disciplines such as sociology were still under suspicion. That contextual factor has certainly introduced an added element to the Chinese academy—the importance of correct political attitudes. As a consequence, research aims are required to be framed within the mainstream ideology and group efforts are more likely to be observed. Japan has a different tradition regarding
academies. Although research autonomy is encouraged and political factors are not necessarily emphasized, research still tends to be applied in nature. This is largely because of structural factors, in that most academies are affiliated with various government ministries or with research institutes that are concerned with policy formulation and teaching as well as pure research. Also, Japanese social scientists are more inclined to participate in group projects headed by a leader in the field. Whether this is consistent with the national character remains to be explored. With the relatively positive outlook for economic growth in the near future, the academies in Taiwan, mainland China, and Japan may experience substantial concomitant development and may thus reach new horizons in certain research fields. Nevertheless, academic collaboration within the region itself is still comparatively rare and may become a focus of future agendas. Hopefully, a unique Asian perspective may be developed from persistent and extensive social science studies in these areas.

See also: Centers for Advanced Study: International\Interdisciplinary; History of Science; Science Funding: Asia; Scientific Academies, History of; Universities, in the History of the Social Sciences
C.-C. Yi
Scientific Concepts: Development in Children

This article examines the development of children's scientific understanding. It is organized into four
sections: initial understanding; development of physical concepts; development of biological concepts; and learning processes.
1. Initial Understanding of Scientific Concepts

A large body of recent evidence indicates that infants' conceptual understanding is considerably more sophisticated than previously assumed. Traditionally, researchers relied on children's verbal explanations and\or their actions as measures of their conceptual understanding. These methods often underestimated infants' and very young children's understanding, owing to their inarticulateness and poor motor coordination. However, a new method, the violation-of-expectation paradigm, has made it possible to assess infants' physical knowledge by examining how long they look at 'possible' and 'impossible' events. A typical experiment involves habituating the child to a series of physically possible events and then presenting either a different possible event or an impossible event. The assumption is that children who appreciate that one of the events is impossible will look longer at it, because they are surprised to see the principle violated. Studies using this violation-of-expectation paradigm have revealed impressive initial understanding of physical concepts. Even infants possess certain core concepts and understand several principles that govern the mechanical movement of objects (Spelke and Newport 1998). For example, 4-month-olds have a notion that one solid object cannot move through the space occupied by another solid object. In the studies that demonstrated this phenomenon, infants were first habituated to a display in which a ball was dropped behind a screen, which was then removed to reveal the ball on the floor. They then saw two events. In the consistent condition, the ball was again dropped, and when the screen was removed, infants saw the ball resting on a platform above the stage floor. In the inconsistent event, the ball was resting on the stage floor under the platform. Infants looked longer at the inconsistent event than at the consistent one, as if they were surprised to see the violation of the object–solidity principle. Other studies with a similar approach have also revealed infants' understanding of gravity and other physical regularities. Infants appear to understand that an unsupported object should move downward and that objects do not ordinarily move without any external force being applied. Infants' knowledge of physical regularities is gradually refined over the first year, as demonstrated by their understanding of collisions. Most 3-month-olds appear surprised to see a stationary object move when not hit by another object. Six-month-olds can appreciate how the features of objects affect a collision. They appear surprised when an object moves further following a collision with a small moving object than after colliding with a larger moving
object. Later in the first year, infants respond differently to events in which the struck object moves at an angle perpendicular to the motion of the object that struck it than to events in which it moves along a more standard path. Other researchers, however, raise concerns about the use of such paradigms to draw inferences about infants' conceptual knowledge (e.g., Haith and Benson 1998). They argue that differential looking only indicates that infants discriminate between two events and that perceptual rather than conceptual features might drive this discrimination. Thus, infants' visual preference or looking time might have been incorrectly interpreted as evidence of an appreciation of physical principles. Early conceptual understanding is not limited to understanding of physics principles. By age 3 years, children can distinguish living from nonliving things. They recognize that self-produced movement is unique to animals. In one study that made this point, preschoolers were shown brief videotapes in which animals or inanimate artifacts moved across the screen. Then the children were asked to make judgments about internal causes (does something inside this object make it move) and external causes (did a person make it move). Children typically attributed the cause of the animate object's motion to internal features ('it moves itself;' 'something inside makes it move'). In contrast, they were more likely to attribute the motion of an artifact to an external agent (Gelman 1990). Preschoolers are also sensitive to differences in the 'types of stuff' inside animate and inanimate objects. They draw inferences about identity and capacity to function based on internal parts, associating, for example, internal organs, bones, and blood with animals and 'hard stuff' or 'nothing' with inanimate objects. After hearing a story about a skunk that was surgically altered so that it looked like a raccoon, young children reported that the animal was still a skunk, despite its altered appearance. However, children did not reason in the same way when they heard similar stories about artifacts; a key that was melted down and stamped into pennies was no longer a key (Keil 1989).
2. Developing Understanding of Physical Concepts

Rudimentary understanding of basic concepts does not imply full-blown appreciation of physical principles. Even older children's concepts and theories often involve substantial misconceptions. Understanding of physical concepts undergoes substantial change with age and experience. One good example involves physical causality and mechanical movement. When Event A precedes Event B, many 3- and 4-year-olds fail to choose A consistently as the cause, whereas
5- and 6-year-olds are considerably more likely to choose A. Children also hold intuitive theories of motion that are inconsistent with fundamental mechanical principles. For example, when asked to predict how a ball would travel after rolling through a spiral tube, only one-fourth of 9-year-olds and less than half of 11-year-olds correctly predicted the ball's trajectory. Misconceptions also occur with other concepts. For example, children's conceptions of matter, weight, volume, and density undergo substantial change with age. Most 3-year-olds have undifferentiated notions of the roles of density, weight, and volume in producing buoyancy of objects placed in liquids. Most 4- and 5-year-olds have some conception of density, although their judgments are also affected by other features of the objects (i.e., weight and volume). Eight- and 9-year-olds, in contrast, rely consistently on density in judging whether objects will sink or float. On some physical tasks, children's increasing understanding can be characterized as a series of increasingly adequate rules. One such task is Siegler's (1976) balance scale task. On each side of the scale's fulcrum were four pegs on which metal weights could be placed. In each trial, children were shown a configuration of weights on pegs and were asked to predict whether the scale would balance or whether one side would go down after release of a lever that held the scale motionless. Most children based their predictions on one of four rules. The large majority of 5-year-olds relied solely on weight (Rule I). This involved predicting that the scale would balance if both sides had the same amount of weight and that the side with more weight would go down if the two sides had different amounts of weight. Nine-year-olds often used Rule II. This involved predicting that the side with more weight would go down when one side had more weight, but predicting that the side with its weight further from the fulcrum would go down when weights on the two sides were equal. Some 9-year-olds and most 13–17-year-olds used Rule III. They considered both weight and distance on all problems, and predicted correctly when weights, distances, or both were equal for the two sides. However, when one side had more weight and the other side's weights were further from the fulcrum, children muddled through, not relying consistently on any identifiable approach. Rule IV allowed children to solve all balance scale problems. It involved choosing the side with greater torque (W_L × D_L vs. W_R × D_R) when one side had more weight (W) and the other had its weight further from the fulcrum (D). Few children or adults used Rule IV. Similar sequences of rules have been shown to characterize development on a variety of tasks, including shadow projection, probability, water displacement, conservation of liquid and solid quantity, and time, speed, and distance.
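The four rules can also be stated compactly as decision procedures. The sketch below is illustrative only and is not drawn from Siegler's materials; the function names and the encoding of a problem as left/right (weight, distance) values are assumptions made here for clarity.

```python
import random

# Illustrative sketch of Siegler's (1976) balance-scale rules. A problem is
# encoded as left/right weight (lw, rw) and peg distance from the fulcrum
# (ld, rd); each rule returns "left", "right", or "balance".

def rule_1(lw, ld, rw, rd):
    # Rule I: weight alone decides.
    if lw == rw:
        return "balance"
    return "left" if lw > rw else "right"

def rule_2(lw, ld, rw, rd):
    # Rule II: weight decides; distance is consulted only when weights are equal.
    if lw != rw:
        return "left" if lw > rw else "right"
    if ld == rd:
        return "balance"
    return "left" if ld > rd else "right"

def rule_3(lw, ld, rw, rd):
    # Rule III: both dimensions considered; non-conflict problems are answered
    # correctly (torque gives the same answer on those), but on conflict
    # problems (one side more weight, the other more distance) the child
    # "muddles through" -- modeled here as an unsystematic guess.
    if lw == rw or ld == rd or (lw > rw) == (ld > rd):
        return rule_4(lw, ld, rw, rd)
    return random.choice(["left", "right", "balance"])

def rule_4(lw, ld, rw, rd):
    # Rule IV: compare torques, weight times distance on each side.
    lt, rt = lw * ld, rw * rd
    if lt == rt:
        return "balance"
    return "left" if lt > rt else "right"

# A conflict problem: more weight on the left, greater distance on the right.
print(rule_1(3, 1, 2, 4))   # "left"  (weight-only prediction)
print(rule_4(3, 1, 2, 4))   # "right" (torques: 3*1 < 2*4)
```

On the conflict problem shown, the rules diverge, which is what makes such items useful for diagnosing which rule a child is following.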
A related way of conceptualizing the development of scientific understanding is as a succession of increasingly adequate mental models. One good example of such a succession involves understanding of the Earth as an astronomical object (Vosniadou and Brewer 1992). Some children, particularly young ones, conceive of the Earth as a flat, solid, rectangular shape. A slightly more sophisticated mental model is to think of the Earth as a disk. Three yet more advanced but still incorrect approaches are the 'dual Earth model,' which includes a flat, disk-like Earth where people live and a spherical Earth up in the sky; the 'hollow sphere model,' in which people live on flat ground inside a hollow sphere; and the 'flat sphere model,' in which people live on flat ground on top of a hollow sphere. All three of these models allow children to reconcile their perception that the Earth looks flat with their teachers' and textbooks' insistence that the Earth is round. The proportion of children who possess the correct 'spherical model' of the Earth increases from 15 percent to 40 percent to 60 percent from first to third to fifth grade.
3. Developing Understanding of Biological Concepts

Young children have a concept of living things, but it does not perfectly match the concept of older children and adults. Until about age 7 years, most children do not view plants as living things. In addition, 3-year-olds fairly often attribute life to powerful, complex, or moving inanimate objects, such as robots and the Sun. A similar mix of understandings and misunderstandings is evident in preschoolers' views regarding internal parts of living things. They know that animals have bones and hearts, but have little idea of their functions. Children's understanding of other uniquely biological concepts, such as growth, inheritance, and illness, also undergoes substantial change with age and experience. Preschool children have some appreciation of biological growth. They expect animals to grow, appreciate that growth can only occur in living things, and understand that growth is directional (small to big). However, preschoolers also believe that living things may or may not grow, and have difficulty accepting that even small things, such as a worm or butterfly, grow. Not until age 5 or 6 years do children realize the inevitability of growth; one cannot keep a baby pet small and cute just because one wants to do so. Children's understanding of inheritance, another uniquely biological process, also develops with age and experience. Preschoolers understand that like begets like: Dogs have baby dogs, rabbits have baby rabbits, and offspring generally share biological properties with parents (Wellman and Gelman 1998). They also believe that animals of the same family share physical features even when they are raised in different environments. For example, preschoolers believe that
a rabbit raised by monkeys would still prefer carrots to bananas. However, other studies suggest that not until 7 years of age do children understand birth as part of a process mediating the acquisition of physical traits and nurturance as mediating the acquisition of beliefs. Only older children clearly distinguish between properties likely to be affected by heredity and properties likely to be affected by environment. For example, not until school age do children expect a boy to resemble his biological father in appearance but to resemble his adoptive father in beliefs. Thus, younger children seem to have different intuitions about the mechanisms of inheritance than do older children and adults. Even preschoolers show some understanding of illness, yet another biological process. For example, preschoolers have a notion that an entity can induce illness or be contaminated and that contamination may occur through the workings of invisible, physical particles. Four- and 5-year-olds have some understanding of contagion; they believe that a child is more likely to get sick from exposure to another person who caught the symptom by playing with a sick friend than from another person who developed the symptom through other means. Although preschoolers may have a general idea that germs can cause symptoms, they do not differentiate the effects of symptoms caused by germs from those caused by poison, for example. The mature concept of illness, which is characterized as uniting various components such as its acquisition, symptoms, treatment, and transmission, is not mastered until much later (Solomon and Cassimatis 1999). As with physical concepts, children's understanding of biological concepts involves substantial developmental change. For some concepts, changes involve enrichment, as children learn more and more details and phenomena relevant to the concepts. For others, developmental changes involve radical conceptual reorganization. There are some interesting parallels between the redefining and restructuring involved in the history of scientific understanding and the changes that occur within an individual lifetime (Carey 1985).
4. Learning Processes

Acquisition of scientific understanding involves the discovery of new rules and concepts through direct experience, as well as through instruction. Children's misconceptions can be overcome through experience that contradicts them. Only recently, however, have researchers directly examined the learning processes involved in the acquisition of scientific concepts. One approach that has proved particularly useful for learning about changing understanding of scientific concepts is the microgenetic method. This approach involves observing changing performance on a trial-by-trial basis, usually as children gain experience that
promotes rapid change. Thus, the approach yields the type of high-density data needed to understand change processes (Siegler and Crowley 1991). One example of the usefulness of the approach is provided by Siegler and Chen's (1998) study of preschoolers' learning about balance scales. The children were presented with problems in which the two sides of the scale had the same amount of weight, but one side's weight was further from the fulcrum. The goal was to see if children acquired Rule II, which correctly solves such problems, as well as problems on which weight on the two sides varies but distance does not. Children's rules were assessed in a pretest and posttest in which children were asked to predict which side of the scale would go down or whether it would remain balanced. In the feedback phase between the pretest and posttest, children were repeatedly asked to predict which side of the balance, if either, would go down if a lever that held the arm motionless was released; then the lever was released and the child observed the scale's movement; and then the child was asked to explain the outcome they had observed. The trial-by-trial analysis of changes in children's predictions and explanations allowed the examination of the learning processes involved in rule acquisition. Four learning processes were identified. The first component of learning involves noticing potential explanatory variables (e.g., the role of distance) that previously had been ignored. The second involves formulating a rule that incorporates distance as well as weight. To be classified as formulating a rule, children needed to explain the scale's action in one trial by stating that a given side went down because its disks were further from the fulcrum, and then in the next trial to predict that the side with its disks further from the fulcrum would go down. The third component involves generalizing the rule to novel problems by using it in most trials after it was formulated. Finally, the last component involves maintaining the new rule under less facilitative circumstances, by using the new rule in the posttest, where no feedback was given. The componential analysis proved useful for understanding learning in general and also developmental differences in learning. The key variable for learning of both older and younger children, and the largest source of developmental differences in learning, involved the first component, noticing the potential explanatory role of distance from the fulcrum. Most 5-year-olds noticed the potential role of distance during learning, whereas most 4-year-olds did not. Children of both ages who noticed the role of distance showed high degrees of learning; those of both ages who did not, did not. The same componential analysis of children's learning of scientific concepts has proved useful in examining children's learning about water displacement and seems applicable to many other concepts also. Although children often discover new rules, modify their mental models, or acquire new concepts through
direct experience in both physical and biological domains, direct observation of the natural world is often inadequate for learning new concepts. Indeed, daily experience sometimes hinders children's understanding. For example, children's misconceptions about the shape and motion of the Earth might result from the fact that the world looks flat. Well-planned instruction is essential for helping children to overcome these misconceptions and to gain more advanced understanding. The effects of instruction on children's scientific understanding can be illustrated by a study of the acquisition of the variable control principle (Chen and Klahr 1999). The variable control principle involves manipulating only one variable at a time so that an unconfounded experiment can be conducted and valid inferences can be made about the results. Most early elementary school children do not discover the variable control concept on their own. This observation led Chen and Klahr (1999) to test whether second, third, and fourth graders could learn the concept through carefully planned instruction. Children were asked to design experiments to test the possible effects of different variables (whether the diameter of a spring affects how far it stretches, whether the shape of an object affects the speed with which it sinks in water, etc.). An unconfounded design would contrast springs that differed only in diameter but not in length, for example. Direct instruction proved effective in promoting children's acquisition of the control-of-variables concept. Both older and younger children who received instruction in designing tests in a specific task were able to understand the rationale and to apply the principle to other tasks. However, the older children were more able to extend the principle to novel contexts. When receiving training in designing tests involving the diameter of a spring, for example, second graders were able to apply the concept only to testing other variables involving springs (e.g., wire size). Third graders used the principle in designing experiments involving other mechanical tasks, such as the speed with which objects sank in water. Only fourth graders, however, were able to apply the principle to remote contexts, such as experiments on the causes of plant growth.
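The logic of the variable control principle can be made concrete in a few lines of code. The sketch below is illustrative and not taken from Chen and Klahr's materials; the function names and the encoding of an experimental setup as a dictionary of variable settings are assumptions made here.

```python
# Control-of-variables logic (illustrative sketch): a contrast between two
# experimental setups licenses an inference about a target variable only if
# the setups differ on that variable and on nothing else.

def differing_vars(setup_a, setup_b):
    """List the variables on which two setups differ."""
    return [v for v in setup_a if setup_a[v] != setup_b[v]]

def is_unconfounded_test(setup_a, setup_b, target):
    """A valid test of `target` varies it while holding all else constant."""
    return differing_vars(setup_a, setup_b) == [target]

# Does spring diameter affect how far the spring stretches?
wide       = {"diameter": "wide",   "length": "long",  "wire": "thin"}
narrow     = {"diameter": "narrow", "length": "long",  "wire": "thin"}
confounded = {"diameter": "narrow", "length": "short", "wire": "thin"}

print(is_unconfounded_test(wide, narrow, "diameter"))      # True
print(is_unconfounded_test(wide, confounded, "diameter"))  # False: length varies too
```

The second comparison fails because any observed difference in stretch could be due to diameter or to length, which is precisely the confound the children in the study had to learn to avoid.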
In summary, recent research has revealed that infants, toddlers, and preschoolers have considerably greater scientific knowledge than previously recognized. However, their knowledge is far from complete. Developmental changes in children's scientific understanding involve both enrichment and structural reorganization. Older children possess more accurate and coherent rules and mental models. These understandings arise, at least in part, from their richer experience, their more advanced abilities to encode and interpret that experience, and their superior ability to separate their theories from the data (Kuhn et al. 1995). Older children also generalize the lessons from instruction more effectively than do younger children. Although there is general agreement that early understanding of scientific concepts is surprisingly strong, there are disagreements about exactly what knowledge or concepts to attribute to infants, toddlers, and preschoolers. Even less is known about how children progress from their initial understanding of scientific concepts to more advanced understanding. Addressing the issue of how change occurs remains a major challenge, as well as a fruitful direction for future research.

See also: Cognitive Development in Childhood and Adolescence; Cognitive Development in Infancy: Neural Mechanisms; Concept Learning and Representation: Models; Early Concept Learning in Children; Infant Development: Physical and Social Cognition; Piaget's Theory of Child Development; Scientific Reasoning and Discovery, Cognitive Psychology of
Bibliography

Carey S 1985 Conceptual Change in Childhood. MIT Press, Cambridge, MA
Chen Z, Klahr D 1999 All other things being equal: acquisition and transfer of the control of variables strategy. Child Development 70: 1098–120
Gelman R 1990 First principles organize attention to and learning about relevant data: number and the animate–inanimate distinction as examples. Cognitive Science 14: 79–106
Haith M, Benson J 1998 Infant cognition. In: Damon W, Kuhn D, Siegler R S (eds.) Handbook of Child Psychology. Vol. 2. Cognition, Perception and Language, 5th edn. J. Wiley, New York, pp. 199–254
Keil F C 1989 Concepts, Kinds, and Cognitive Development. MIT Press, Cambridge, MA
Kuhn D, Garcia-Mila M, Zohar A, Andersen C 1995 Strategies of knowledge acquisition. Monographs of the Society for Research in Child Development 60: 1–127
Siegler R S 1976 Three aspects of cognitive development. Cognitive Psychology 8: 481–520
Siegler R S, Chen Z 1998 Developmental differences in rule learning: a microgenetic analysis. Cognitive Psychology 36: 273–310
Siegler R S, Crowley K 1991 The microgenetic method: a direct means for studying cognitive development. American Psychologist 46: 606–20
Solomon G E A, Cassimatis N L 1999 On facts and conceptual systems: young children's integration of their understandings of germs and contagion. Developmental Psychology 35: 113–26
Spelke E S, Newport E L 1998 Nativism, empiricism, and the development of knowledge. In: Damon W, Lerner R M (eds.) Handbook of Child Psychology. Vol. 1. Theoretical Models of Human Development, 5th edn. J. Wiley, New York, pp. 275–340
Vosniadou S, Brewer W F 1992 Mental models of the earth: a study of conceptual change in childhood. Cognitive Psychology 24: 535–85
Wellman H M, Gelman S A 1998 Knowledge acquisition in foundational domains. In: Damon W, Kuhn D, Siegler R S (eds.) Handbook of Child Psychology. Vol. 2. Cognition, Perception and Language, 5th edn. J. Wiley, New York, pp. 575–630
Z. Chen and R. Siegler
Scientific Controversies

Science in general can be an object of controversy, as in disputes between science and religion. Particular scientific findings can also generate controversies either within or outside science. The importance of scientific controversy has been recognized by scholarship within science and technology studies (S&TS) since the 1970s. Indeed, the study of controversies has become an important methodological tool to gain insight into key processes that are not normally visible within the sciences. What makes something a scientific controversy? It is important to distinguish longstanding disputes, such as that between science and religion, or the merits of sociobiological explanation as applied to humans, or whether the fundamental constituents of matter are particles or waves, from more localized disputes such as over the existence of a new particle or a new disease-transmitting entity. The latter sorts of controversy are more like a 'hot spot' that erupts for a while on the surface of science than a deeply entrenched, long-running battle. Also, controversies are not to be confused with the bigger sea changes which science sometimes undergoes during scientific revolutions. Although defining a scientific revolution is itself contested, the all-pervasive nature of the changes in physics brought about by quantum mechanics and relativity seems different from, for example, the controversy over the detection of large fluxes of gravitational radiation or over the warped zipper model of DNA. Similarly, long-running debates on the relative impacts of nature and nurture on human behavior have a different character from more episodic controversies, such as the possibility of interspecies transfer of prion disease. Of course, intractable disputes and revolutions share some of the features associated with controversies, but it is the bounded nature of controversies which has led to their becoming an object of study in their own right, especially within the tradition of S&TS associated with the sociology of scientific knowledge (SSK). One metaphor for understanding why controversies have taken on such methodological importance is that of 'punching' a system. On occasion scientists gain insight into natural systems by punching, or destabilizing, them. For example, one may learn more about the laws of momentum by bouncing one billiard ball off another than by watching a stationary billiard ball.
Similarly, Rutherford famously used scattering experiments in which gold foil was bombarded with alpha particles to uncover the structure of the atom and in particular the presence of the nucleus. The methodological assumption underpinning the study of controversies is similar. By studying a scientific controversy, one learns something about the underlying dynamics of science and technology and their relations with wider society. For instance, during a controversy the normally hidden social dimensions of science may become more explicit. Sites of contestation are places that facilitate the investigation of the metaphors, assumptions, and political struggles embedded within science and technology. We can note four influential approaches towards the study of scientific controversies. The school of sociological research associated with Robert Merton (1957) first recognized the importance of controversies within science. Of particular interest to Merton was the existence of priority disputes. Many well-known controversies center on who was the first scientist to make a particular scientific discovery. A second approach toward the study of scientific controversy developed in the 1960s as concerned citizens increasingly protested what they took to be the negative effects of science and technology. Here the source of controversy is the perceived negative impact of science and technology on particular groups, and it is the study of these political responses that forms the core of the analysis. The new SSK which emerged in the 1970s and which largely displaced the Mertonian School provides a third approach towards the study of controversies. Here the focus is on controversies at the research frontiers of science, where typically some experimental or theoretical claim is disputed within an expert community. Modern S&TS owe a heavy debt to SSK but are less likely to make distinctions between the content of science and its impact. Within this fourth approach, controversies are seen as integral to many features of scientific and technological practice and dissemination. Their study forms a key area of the discipline today.
1. Merton and Priority Disputes

Merton's interest in priority disputes stemmed from his claim that science has a particular normative structure or 'institutional ethos' with an accompanying set of rewards and sanctions. Because so much of the reward structure of science is built upon the recognition of new discoveries, scientists are particularly concerned to establish the priority of their findings. Priority disputes are legion: witness the famous fight between Newton and Leibniz over who first discovered the calculus. It was Thomas Kuhn (1962) who first raised a fundamental problem for the analysis of priority
disputes. A priority dispute is predicated upon a model of science, later known as the 'point model' of scientific discovery, which can establish unambiguously who discovered what and when. Asking the question of who discovered oxygen, Kuhn showed that the crucial issue is what counts as oxygen. If it is the dephlogisticated air first analyzed by Priestley then the discovery goes to him, but if it is oxygen as understood within the modern meaning of atomic weights then the discovery must be granted to Lavoisier's later identification. The 'point model' requires discovery to be instantaneous, and for discoveries to be recognized and dated. A rival 'attributional model' of discovery, first developed by Augustin Brannigan (1981), draws attention to the social processes by which scientific discoveries are recognized and 'attributed.' This approach seems to make better sense of the fact that what counts as a discovery can vary over time. In short, it questions the Eureka moment of the point model. For example, Woolgar (1976), in his analysis of the pulsar's discovery, shows that the date of the discovery varies depending on what stage in the process is taken to be the defining point of the discovery. If the discovery is the first appearance of 'scruff' on Jocelyn Bell's chart recording of signals from the radio telescope, then it will be dated earlier than when it was realized that the unambiguous source of this 'scruff' was a star. This case was particularly controversial because it was alleged by the dissident Cambridge radio astronomer Fred Hoyle that the Nobel Prize winners for this discovery should have included Jocelyn Bell, who was then a graduate student. Priority disputes can touch in this way on the social fabric of science, such as its gender relationships and hierarchical structure. Despite the challenge posed by the attributional model, it is the point model of discovery that is embedded in the reward system of science. As a result, priority disputes still abound. In modern technoscience, discovery can mean not only recognition, but also considerable financial reward, as for example with patents, licensing arrangements, or stock in a biotech company. In such circumstances, priority disputes have added salience. One has only to think of the unseemly battle between Robert Gallo and the Pasteur Institute over priority in the discovery that HIV is the cause of AIDS. In this case, there was not only scientific priority at stake, but also the licensing of the lucrative blood test for identifying AIDS. The controversy could only be settled by intervention at the highest political level: the US President, Ronald Reagan, and the French Prime Minister, Jacques Chirac, agreed to share the proceeds from the discovery. Again, what was at stake scientifically was not simply who was first past the post; the protagonists initially claimed to have isolated different retroviruses and disagreed over the effectiveness of the various blood tests. This case was marked by additional controversy because of allegations
of scientific misconduct raised against Gallo that led to Congressional and National Institutes of Health (NIH) investigations.
2. Controversy over the Impact of Science and Technology

That a priority dispute could require the intervention of national political leaders is an indication of just how important science and technology can become for the wider polity. In response to the AIDS crisis, activist groups have campaigned and pressured scientists and government officials to do more scientifically. They have also intervened in matters of research design, such as the best way to run controlled clinical trials. Such activist engagement dates back to the political protests that science and technology generated in the 1960s in the context of Vietnam-era issues such as war and environmentalism. There has been increasing recognition that science and technology are neither neutral nor necessarily beneficial and that many developments stemming from modern science and technology, such as nuclear power, petrochemical industries, and genetic engineering, raise profound and controversial issues for a concerned citizenry. Dorothy Nelkin, a pioneer in analyzing these types of disputes, identified four types of political, economic, and ethical controversies that engage the public in the US (Nelkin 1995). One set revolves around the social, moral, and religious impact of science. Issues such as the teaching of evolution in US schools, animal rights, and the use of fetal tissue fall into this first category. A second type of controversy concerns a clash between the commercial and economic values surrounding science and technology and those of the environmental movement. Ozone depletion, toxic waste dumps, and greenhouse gases are pertinent examples. A third set has been provoked by health hazards arising from the transformation of food and agricultural practices by the use of modern science and technology. Genetically modified foods, the carcinogenic risks posed by food additives, and the use of bovine growth hormones in the dairy industry all belong in this category. A fourth group centers on conflicts between individual rights and group rights: a conflict that has been heightened by new developments in science and technology. For example, the mass fluoridation of water to improve dental health denies individuals the right to choose for themselves whether they want fluoride in their water supply. Research on these sorts of controversies has focused mainly on the interest politics of the groups involved. How and why do they get involved in political action over science and technology; what underlying political values do such groups exhibit; and how do they effectively intervene to protest some perceived deleterious development stemming from science, technology, or medicine? The positions taken by the
participants are consistent with their interests, although these interests may not enable the outcome or closure of a debate to be predicted. For instance, the demise of nuclear power had as much to do with economics as with political protest. Since scientists themselves often play an active part in these disputes, a full analysis will touch upon how scientists deploy their science for political aims. But, by and large, this research tradition has avoided using the entry of scientists into these disputes to examine the core processes by which scientific knowledge is developed and certified. In short, the attention was focused upon seeing how scientists became political rather than upon how politics might itself shape scientific knowledge. Political controversies were treated as analytically separable from epistemic controversies and as resolved by distinct processes of closure (Engelhardt and Caplan 1987). Typically, epistemic controversies were thought to be closed by application of epistemic and methodological standards, while political controversies were closed through the intervention of 'non-scientific factors,' such as economic and political interests.
3. Scientific Controversy and the Sociology of Scientific Knowledge

With the emergence of the SSK in the late 1970s, it was no longer possible to avoid examining how scientific knowledge was shaped and how this shaping contributed to the dynamics of controversies. A key tenet of this new sociology of science, as formulated by David Bloor (1991) in his 'Strong Programme,' was that of symmetry. This principle called upon sociologists to use the same explanatory resources to explain both successful and unsuccessful knowledge claims. It raised to methodological status the necessity of examining the processes by which science distinguishes the wheat of truth from the chaff of error. SSK soon turned its attention towards examining scientific controversies because it is during such controversies that this symmetry principle can be applied to good effect. With each side alleging that it has 'truth' on its side, and disparaging the theoretical and experimental efforts of the other, a symmetrical analysis can explain both sides of the controversy using the same sorts of sociological resources. This differs from the earlier interest approach to controversies in that it applies this symmetrical sociological analysis to the very scientific claims made by the participants. Bloor and his colleagues of the Edinburgh school pursued their program mainly through theoretical analysis supported by historical case studies. H. M. Collins and the 'Bath School,' by contrast, developed an empirical method for studying the SSK in contemporaneous cases: a method based primarily upon the study of scientific controversies. One early application
of the method was to the study of parapsychology (Collins and Pinch 1982). Collins and Pinch suggested that controversies such as that provoked by parapsychology were resolved by boundary crossing between two different forums of scientific debate: the constitutive and the contingent. Generalizing from several case studies of controversies, Collins (1981) argued that during controversies scientific findings exhibited 'interpretative flexibility,' with the facts at stake being debated and interpreted in radically different ways by the parties in the controversy. This interpretative flexibility did not last forever: by following a controversy over time, researchers could delineate the process of 'closure' by which controversy vanished and consensus emerged. Collins defined the group of scientists involved in a controversy as the 'core set.' Only a very limited set of scientists actively partook in controversies; the rest of the scientific community depended upon the core set for their expert judgment as to what to believe. This was particularly well illustrated by Martin Rudwick (1985) in his study of the Great Devonian Controversy in the history of geology. As researchers followed controversies from their inception to the point of closure, it became necessary to address matters of scientific method as they were faced in practice by the participants. Factors that had usually been seen as issues of method or epistemology thus became open to sociological investigation: for example, the replication of experiments, the role of crucial experiments, proofs, calibration, statistics, and theory. In addition, other factors such as reputation, rhetoric, and funding were shown to play a role in the dynamics of controversies. An important finding of this research was what Collins (1992) called the 'experimenter's regress.' Controversies clearly were messy things and were very rarely resolved by experiments alone. Collins argued that in more routine science experiments were definitive because there was an agreed-upon outcome which scientists could use as a way of judging which scientists were the competent practitioners. If one could get one's experiment to work, one had the requisite skills and competence; if one failed, one lacked the skills and competence. The trouble was that when there was a dispute at the research frontiers there was no agreed-upon outcome by which to judge the competent practitioners. Experiments had to be built to investigate a claimed new phenomenon, but failure to find the new phenomenon might mean either that there was no new phenomenon to be found or that the experimenter failing to find it was incompetent. This regress was only broken as a practical matter by the operation of a combination of factors such as rhetoric, funding, and prior theoretical dispositions. Often the losing side in a scientific controversy continues to fight for its position long after the majority consensus has turned against it. Those who continue will meet increasing disapprobation from
their colleagues and may be forced to leave science altogether. 'Life after death' goes on at the margins and often finally passes away only when the protagonists themselves die or retire (Simon 1999). The uncertain side of science is clearest during moments of controversy. Most scientists never experience controversies directly, and often it is only after exposure to a controversy that scientists become aware of the social side of science, start reading in science studies, and even employ ideas drawn from science studies to understand what has happened to them. This work on scientific controversy has been exemplified by a number of case studies of modern science such as memory transfer, solar neutrinos, gravity waves, high-energy physics, and, famously, cold fusion. Historians have also taken up the approach used by sociologists, and the sociological methods have been extended to a number of historical case studies. Such studies pose particular methodological challenges because often the losing viewpoint has vanished from history. Shapin and Schaffer's (1985) study of the dispute between Robert Boyle and Thomas Hobbes over Boyle's air pump experiments was a landmark in research on scientific controversy, because it showed in a compelling way how the wider political climate, in this case that of Restoration Britain, could shape the outcome of a controversy. It showed how that climate could at the same time help institutionalize a new, experimental way of making facts in the Royal Society. In addition, it drew attention to the literary and technological dimensions of building factual assent in science. The documenting of the witnesses to particular experimental performances gave birth to a culture of 'virtual witnessing.' The SSK approach to scientific controversy has also been influential in the study of technology. The social construction of technology (SCOT) framework uses concepts imported from the study of scientific controversy such as 'interpretative flexibility' and 'closure.' A variety of competing meanings are found in technological artifacts, and scholars study how 'closure mechanisms' such as advertising produce a stable meaning of a technology (Pinch and Bijker 1987, Bijker 1995). Another influential approach to the study of controversies in science and technology has been that developed by Bruno Latour and Michel Callon. Again, the initial impetus came from studies of scientists. Callon's (1986) article on a controversy over a new method of harvesting scallops is one of the first articulations of what later became known as Actor Network Theory (ANT). Callon argues that the outcome of a controversy cannot be explained by reference to the social realm alone; the analyst must also take account of the actions of non-human actors, such as scallops, which play a part in shaping the outcome. Subsequently, Latour's work on how 'trials of strength' are settled in science and technology has become especially influential within the new SSK.
Such struggles, according to Latour (1987), involve aligning material and cognitive resources with social ones into so-called 'immutable mobiles' or black boxes, objects which remain fixed when transported along scientific networks and which contain embedded within them sets of social, cognitive, and material relationships. Latour and Woolgar (1979), in their now-classic study of a molecular biology laboratory, showed that literary inscriptions play a special role in science. They indicated how controversies could be analyzed in terms of whether certain modalities are added to or subtracted from scientific statements, making them more or less fact-like. The role of discourse in scientific controversies has been examined in great depth in a study of the oxidative phosphorylation controversy by Gilbert and Mulkay (1984). They showed how particular repertoires of discourse, such as the 'empiricist repertoire' and the 'contingent repertoire,' are used selectively by scientists in order to bolster their own claims or undermine those of their opponents. Subsequently there has been much work on how a variety of rhetorical and textual resources operate during controversies (e.g., Myers 1990). Sometimes the resolution of controversy is only possible by drawing boundaries around the relevant experts who can play a role in the controversy. Sometimes particular scientific objects cross such boundaries and form a nexus around which a controversy can be resolved. Such 'boundary work' (Gieryn 1983) and 'boundary objects' (Star and Griesemer 1989) form an important analytical resource for understanding how controversies end. In addition to analyzing scientific controversies, SSK has itself become a site of controversy. Most notably, lively controversies have occurred over the viability of interest explanations, over the extent to which the sociology of science should itself be reflexive about its methods and practices, and over the role of non-human actors. The 'science war' involving debates between natural scientists and people in science studies over the methods and assumptions of science studies and cultural studies of science is another area of controversy that is ripe for sociological investigation.
4. Scientific Controversy in Science and Technology Studies Today

In contemporary S&TS, the sites of contestation chosen for analysis have become more heterogeneous. One strength of the new discipline of S&TS is the wide terrain of activities involving science and technology that it examines. For example, similar methods can be used to examine controversies involving science and technology in the courtroom, the media, quasi-governmental policy organizations, and citizens' action groups. Indeed, many of the most contentious political issues facing governments and citizens today involve
science and technology: issues such as genetically modified foods, gene therapy, and in vitro fertilization. The study of controversies in modern technoscience—with its porous boundaries between science, technology, politics, the media, and the citizenry—also calls for the analyst to broaden the array of analytical tools employed. Although the fundamental insights produced by SSK remain influential, such insights are supplemented by an increased understanding of how macro-political structures such as the state and the legal system enable and constrain the outcome of scientific controversies. Examples of this sort of work include: Jasanoff's (1990) investigations of how technical controversies are dealt with by US agencies such as the Environmental Protection Agency (EPA) and the Food and Drug Administration (FDA), Lynch and Jasanoff's (1998) work on science in legal settings such as the use of DNA evidence in courtrooms, and Epstein's (1996) work on the AIDS controversy. In the latter case Epstein deals not only with the dispute about the science of AIDS causation, but also turns to social movements research to understand how AIDS activists outside of science gained sufficient influence to affect the design of the clinical trials by which new AIDS drugs are tested. Particularly interesting methodological issues have been raised by the study of controversies that overtly impinge upon politics. When studying controversies within science, SSK researchers were largely able to adopt the neutral stance embodied in the symmetry principle of the Strong Programme (see Epigenetic Inheritance). However, some scholars have argued that, when dealing with cases where analysis could have a direct impact upon society, it is much harder to maintain neutrality. Researchers studying these sorts of disputes, such as whether Vitamin C is an effective cancer cure, find they can become 'captured' by the people they are studying. This complicates the possibility of producing the sort of neutral analysis sought after in SSK. A number of solutions have been proposed to this dilemma (see Ashmore and Richards 1996). Several authors have attempted to produce typologies of scientific controversies (Engelhardt and Caplan 1987, Dascal 1998). Unfortunately, such typologies are not as useful as they could be because they are confounded by their underlying epistemological assumptions. For example, within an SSK approach it makes little sense to work with a category of, say, 'sound argument,' for closing a controversy, because in SSK 'sound argument' is seen as part of the controversy. What counts as a 'sound argument' can be contested by both sides. Martin and Richards, in their (1995) review, adopt a fourfold typology. This review is particularly useful because it distinguishes between the different types of epistemological assumptions underlying the different analytical frameworks
employed: namely, positivist, group politics, SSK, and social structural. According to one account (Dascal 1998), traditional history and philosophy of science are becoming more cognizant of the phenomenon of scientific controversy. But the call by historians and philosophers to examine scientific controversy makes strange reading to scholars immersed in S&TS. It is as if the historians and philosophers of controversy simply have ignored or failed to read the relevant literatures. Thus neither Nelkin's (1992) influential volume, Controversies, nor a special issue of Social Studies of Science edited by H. M. Collins (1981), Knowledge and Controversy, which sets out the SSK approach towards scientific controversy, is referenced. That the best way to study scientific controversy is still controversial within the academy could scarcely be more obvious.
See also: Epigenetic Inheritance
Bibliography
Ashmore M, Richards E 1996 The politics of SSK: Neutrality, commitment and beyond. Special issue of Social Studies of Science 26: 219–445
Bijker W 1995 Of Bicycles, Bakelites, and Bulbs: Towards a Theory of Sociotechnical Change. MIT Press, Cambridge, MA
Bloor D 1991 Knowledge and Social Imagery, 2nd edn. University of Chicago Press, Chicago
Brannigan A 1981 The Social Basis of Scientific Discoveries. Cambridge University Press, Cambridge, UK
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge? Sociological Review Monograph, Routledge, London, pp. 196–229
Collins H M 1981 Knowledge and controversy: Studies in modern natural science. Special issue of Social Studies of Science 11: 1
Collins H M 1992 Changing Order, 2nd edn. University of Chicago Press, Chicago
Collins H M, Pinch T J 1982 The construction of the paranormal: Nothing unscientific is happening. In: Wallis R (ed.) On the Margins of Science: The Social Construction of Rejected Knowledge. Sociological Review Monograph, University of Keele, Keele, UK
Dascal M 1998 The study of controversies and the theory and history of science. Science in Context 11: 147–55
Engelhardt T H, Caplan A L 1987 Scientific Controversies: Case Studies in the Resolution and Closure of Disputes in Science and Technology. Cambridge University Press, Cambridge, UK
Epstein S 1996 Impure Science: AIDS, Activism and the Politics of Knowledge. University of California Press, Berkeley, CA
Gieryn T 1983 Boundary work and the demarcation of science from non-science: Strains and interests in professional ideologies of scientists. American Sociological Review 48: 781–95
Gilbert N G, Mulkay M K 1984 Opening Pandora's Box. Cambridge University Press, Cambridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisors as Policy Makers. Harvard University Press, Cambridge, MA
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Latour B 1987 Science in Action. Harvard University Press, Cambridge, MA
Latour B, Woolgar S W 1979 Laboratory Life. Sage, London
Lynch M, Jasanoff S (eds.) 1998 Contested identities: Science, law and forensic practice. Special issue of Social Studies of Science 28: 5–6
Martin B, Richards E 1995 Scientific knowledge, controversy, and public decision making. In: Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Merton R K 1957 Priorities in scientific discovery: A chapter in the sociology of science. American Sociological Review 22: 635–59
Myers G 1990 Writing Biology: Texts and the Social Construction of Scientific Knowledge. University of Wisconsin Press, Madison, WI
Nelkin D (ed.) 1992 Controversies: Politics of Technical Decisions, 3rd edn. Sage, Newbury Park, CA
Nelkin D 1995 Science controversies: The dynamics of public disputes in the United States. In: Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Pinch T J, Bijker W E 1987 The social construction of facts and artifacts: Or how the sociology of science and the sociology of technology might benefit each other. In: Bijker W E, Hughes T P, Pinch T J (eds.) The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology. MIT Press, Cambridge, MA
Rudwick M 1985 The Great Devonian Controversy: The Shaping of Scientific Knowledge Among Gentlemanly Specialists. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
Simon B 1999 Undead science: Making sense of cold fusion after the (arti)fact. Social Studies of Science 29: 61–87
Star S L, Griesemer J 1989 Institutional ecology, 'translations' and boundary objects: Amateurs and professionals in Berkeley's Museum of Vertebrate Zoology, 1907–1939. Social Studies of Science 19: 387–420
Woolgar S 1976 Writing an intellectual history of scientific development: The use of discovery accounts. Social Studies of Science 6: 395–422
T. Pinch
Scientific Culture

1. Modern Scientific Culture
All human societies develop knowledge about their natural and social worlds. Even hunter-gatherers possess remarkable local knowledge about plants, animals, and climate. The great world civilizations had a more complex division of labor that allowed priests, doctors, smiths, farmers, and other specialists to develop more elaborate and less local knowledge systems about astronomy, psychology, medicine, metallurgy, agriculture, and other fields. Researchers
tend to reserve the term 'science' for the elaborated, written knowledge systems of the great world civilizations—such as ancient Greek or Chinese science—and they tend to describe other, more local knowledge systems as ethnomedicine, ethnoastronomy, ethnobotany, and so on. Many attempts have been made to describe the distinctive features of the type of science that emerged in Western Europe around 1500–1700 (Cohen 1997). The term 'modern science' is somewhat clearer than 'Western science,' not only because Western science was built up in part from non-Western sources, but also because Western science rapidly became globalized. However, two clarifications are in order. First, the term 'scientist' did not emerge until the nineteenth century; in earlier centuries the term 'natural philosopher' was more common. Second, the term 'modern' is used here to refer to a society characterized by a family of institutions that includes modern science, constitutional democracy, capitalism, religious pluralism, social mobility, and a universalistic legal system. Although some of those institutions can be found in some of the world's societies prior to 1500, their development as a system occurred in the West and accelerated after 1500. As an institution, science soon found a niche in modern societies by providing research and/or ideological support for the emerging capitalist industries, the state, and the churches (e.g., ballistics, mining, navigation, taxation, public health, critiques of magic). Modern societies provided more than financial resources that supported the growth of modern science as an institution; the culture of modernity provided intellectual resources that contributed to the emergence of modern scientific inquiry. Three of the central values of modern scientific culture are empiricism, formalism, and mechanism. Although each of the distinctive features of modern scientific culture can be found in other scientific cultures (and may not be found across all modern scientific disciplines), as a family of features they have some value in characterizing modern scientific culture, and in showing its points of confluence with the cultures of modernity more generally. Empiricism refers to the value placed on observations as a means for resolving disputes about natural and social knowledge. In some sciences, empiricism was developed as experimentalism; in other words, observations could be brought into the laboratory and subjected to experiments that were, in principle, reproducible by competent members of the scientific community. However, other fields remained nonexperimental, such as the fieldwork-based sciences. Because observations are always subject to interpretation, their use as a resource for dispute resolution was in turn rooted in broader societal cultures. Scientists had to trust each other not to lie (Shapin 1994), and they required societies and journals in which they could debate and share results. Those
requirements were met in the European societies that fostered the emergence of a 'relatively autonomous intellectual class' (Ben-David 1991, p. 304) and a public sphere of open debate (Habermas 1989). The relationship between empiricism and the broader society went beyond institutional requirements; other sectors of society were also characterized by an empirical cultural style. For example, constitutional democracies and markets were based on the 'empiricism' of elections and consumer purchases. Likewise, some Protestant churches replaced church dogma with the empiricism of direct interpretation of Bibles, and they emphasized knowledge of God through study of his works (Hooykaas 1990, p. 191, Merton 1973, Chap. 11). In this sense, the empiricism of science emerged as part of a more general way of resolving conflicting opinions about the world through recourse to data gathering and evidence. Formalism refers to the value placed on increasingly higher levels of generalization. As generalization progresses in science, concepts and laws tend to become increasingly abstract and/or explicit. In some fields, the generalizations took the form of mathematical laws that encompassed a wide range of more specific observations. In physics, for example, it became possible to apply the same set of formal laws to the mechanics of both terrestrial and celestial objects, and space and time were analyzed in terms of geometry (Koyré 1965). In other fields, the generalizations took the form of increasingly formal and abstract taxonomies and systems of classification, as in early modern biology (Foucault 1970). Again, broader institutions and values contributed to the emergence of this specific form of inquiry. When researchers found support for societies that published journals where an archive of research could be located, they found the institutional resources that allowed them to conjugate their work with that of others. Likewise, as Western colonial powers expanded, they sent out scientific expeditions that incorporated local knowledges and new observations into existing research fields. As 'Western' science became increasingly cosmopolitan, it also became more abstract. Again, however, the relationship with the broader society went beyond institutional influences to shared values. For example, scientific formalism developed in parallel with modernizing, Western political systems and social contract theories that emphasized the abstraction of general laws from particular interests, or of a general good from individual wills. A third common feature is the value of mechanism as a form of explanation. Over time, modern science tended to rule out occult astrological forces, vitalistic life forces, and so on. When the concept of force was retained, as in Newton's gravitational force, it was subjected to the restraints of a formal law. Again, general social and cultural resources supported this development. The attack on occult forces was consistent with reformation campaigns against popular
magic (Jacob 1988), and the disenchantment of the world that mechanistic models depended on was both supported by Christianity and intensified by some forms of Protestantism. The development and spread of clocks and constitutions provided an early metaphor of mechanism, and as new technologies and social charters were developed, new metaphors of mechanism continued to emerge (Kammen 1987, pp. 17–8, Shapin 1996, pp. 30–7). Sciences that violate the cultural value of mechanism, such as parapsychology (the study of claimed paranormal phenomena), are rejected. Likewise, the incorporation of local sciences has tended to occur after filtering out occult or vital forces. For example, in response to demand from patients and cost efficiency concerns, acupuncture and Chinese medicine are being incorporated into cosmopolitan biomedicine. However, even as the practices are being incorporated, the vitalistic concept of ‘chi’ and Chinese humoral concepts are being translated into mechanistic concepts consistent with modern biology and physics. Just as values such as empiricism, formalism, and mechanism have been used to describe the intellectual culture of modern science, so a related family of concepts has been used to describe its institutional culture. Most influential was sociologist Robert Merton’s (1973, Chap. 13) list of four central norms. Although subsequent research suggested that norms were frequently violated and better conceptualized as an occupational ideology (Mulkay 1976), Merton’s analysis did have the advantage of pointing to the fundamental preconditions for the existence of modern scientific culture. Perhaps the basic underlying institutional value is autonomy. In other words, there is a value placed on leaving alone a certified community of qualified peers to review and adjudicate the credibility of various claims of evidence and consistency, rather than have the function ceded to the fiat of kings, dictators, church leaders, business executives, and others who do not understand the research field. The value of autonomy creates an interesting tension between science and another modern institution, constitutional democracy. Although scientific communities have suffered greatly under nondemocratic conditions, their demand for some degree of autonomy based on expertise also entails a defense of a type of elitism within a modern, democratic social order that values egalitarianism. The tension is reduced by valuing egalitarianism within the field of science, that is, by reserving it for those persons who have obtained the credentials to practice as a scientist.
2. Variations in Scientific Culture(s)
Going beyond the focus on modern science as a whole, much research on scientific culture has been devoted to its variations over time and across disciplines. Historians occasionally use the concept of periods as a way
of ordering cultural history, for example in the divisions of music history from baroque to classical to romantic. Although periodization is very approximate and relatively subjective, it nonetheless helps to point to some of the commonalities across scientific disciplines within the same time period, and some of the disjunctures in the history of science over time (Foucault 1970). Changes in conceptual frameworks and research programs across disciplines within a time period usually are part of broader cultural changes. For example, in the nineteenth century political values were frequently framed by grand narratives of progress (shared with the 'white man's burden' of colonialism among other ideologies), and likewise new scientific fields such as cosmology, thermodynamics, and evolutionary biology evidenced a concern with temporal issues. In the globalized information society of the twenty-first century, concerns with complex systems have become more salient, and scientists in many disciplines are developing research programs based on ideas of information processing and systemic complexity. Science is to some extent universal in the sense that, for example, physicists throughout the world agree upon the basic laws and empirical findings of physics. However, there are also significant cultural variations in science across geographic regions. The variations are more salient in the humanities and social sciences, where distinctive theoretical frameworks often have a regional flavor. For example, one may speak of French psychoanalysis or Latin American dependency theory. The variations are also obvious for the institutional organization of scientific communities, laboratories, and university systems (Hess 1995). Furthermore, variations in the institutional organization are sometimes associated with differing research traditions in the content of a field. For example, the salary and power structure of German universities in the first third of the twentieth century favored a more theoretical approach to genetics and a concern with evolution and development. In contrast, the relatively collegial governance structures of the American universities, together with their location in agricultural schools, contributed to the development of American genetics in a more specialized, empirical, and applied direction (Harwood 1993). Sometimes variations in institutional structures and conditions are also associated with differences in laboratory technologies, which in turn restrict the selection of research problems. For example, in Japan, funding conditions for physics have in part shaped the emergence of detector designs that differ substantially from those in the USA (Traweek 1988). In addition to the temporal and national cultures of science, a third area of variation in scientific cultures involves disciplinary cultures. For example, both high-energy physics and molecular biology are considered 'big science,' but their disciplinary cultures are substantially different (Knorr-Cetina 1999). Regarding
their intellectual cultures, in high-energy physics data are widely recognized as heavily interpretable, whereas in molecular biology the contact with the empirical world is much more hands-on and direct. Regarding the institutional cultures of the two fields, laboratories in high-energy physics are very large-scale, with the individual’s work subordinated to that of the cooperative team, even to the point of near anonymity in publication. In contrast, in molecular biology the laboratories are smaller, even for the genome project, and competition among individual researchers remains salient, with frequent disputes over priority in a publication byline.
3. Multiculturalism and Science
A third approach to the topic of scientific culture involves the ongoing modernization of scientific cultures. Although modern science is increasingly international and cosmopolitan, its biases rooted in particular social addresses have constantly been revealed. Science has been and remains largely restricted to white men of the middle and upper classes in the developed Western countries (Harding 1998, Rossiter 1995, Schiebinger 1989). As scientific knowledge and institutions have spread across the globe and into historically excluded groups, they have sometimes undergone changes in response to perspectives that new groups bring to science. For example, in the case of primatology, humans and monkeys have coexisted for centuries in India. Consequently, it is perhaps not surprising that Indian primatologists developed new research programs based on monkey–human interactions that challenged the romanticism of Western primatologists' focus on natural habitats (Haraway 1989). More generally, as scientific disciplines become globalized, different national cultural traditions intersect with the globalized disciplinary cultures to reveal unseen biases and new possibilities for research. As women and under-represented ethnic groups have achieved a place in scientific fields, they have also tended to reveal hidden biases and develop new possibilities. Although the reform movements of multiculturalism in science can be restricted to the institutional culture of science (such as reducing racism and sexism in hiring practices), they sometimes also extend to the intellectual culture or content of science. Theories and methods—particularly in the biological and social sciences—that seem transparently cosmopolitan and truthful to white, male colleagues often appear less so to the new groups. Whether it is the man-the-hunter theory of human evolutionary origins (Haraway 1989), the 'master' molecule theory of nucleus to cytoplasm relations (Keller 1985), or biological measures of racial inferiority (Harding 1993), scientific disciplines find themselves continually challenged to prove their universalism and to modify
concepts and theories in response to anomalies and new research. In some cases the members of excluded groups do more than replace old research methods and programs with new ones; they also create new research fields based on their identity concerns. A prime example is the work of the African American scientist George Washington Carver. Although best known in American popular culture for finding many new uses for the peanut, Carver's research was embedded in a larger research program that was focused on developing agricultural alternatives to King Cotton for poor, rural, African American farmers (Hess 1995).
4. Conclusions
It is important not to think of the embeddedness of scientific cultures in broader cultural practices as a problem of contamination. The broader cultures of modern science provide a source of metaphors and institutional practices that both inspire new research and limit its possibilities. For example, if evolutionary theory could not be thought before the progressivist culture of the nineteenth century, it cannot help but be rethought today. Not only have new research findings challenged old models, but the broader cultural currents of complex systems and limits to growth have also inspired new models and empirical research (DePew and Weber 1995). In turn, today's concepts and theories will be rethought tomorrow. The broader societal cultures are not weeds to be picked from the flower bed of scientific culture(s) but the soil that both nurtures and limits its growth, even as the soil itself is transformed by the growth that it supports.
See also: Academy and Society in the United States: Cultural Concerns; Cultural Psychology; Cultural Studies of Science; Culture in Development; Encyclopedias, Handbooks, and Dictionaries; Ethics Committees in Science: European Perspectives; History of Science; History of Science: Constructivist Perspectives; Scientific Academies in Asia
Bibliography
Ben-David J 1991 Scientific Growth. University of California Press, Berkeley, CA
Cohen F 1997 The Scientific Revolution. University of Chicago Press, Chicago
DePew D, Weber B 1995 Darwinism Evolving. MIT Press, Cambridge, MA
Foucault M 1970 The Order of Things. Vintage, New York
Habermas J 1989 The Structural Transformation of the Public Sphere. MIT Press, Cambridge, MA
Haraway D 1989 Primate Visions. Routledge, New York
Harding S (ed.) 1993 The Racial Economy of Science. Routledge, New York
Harding S (ed.) 1998 Is Science Multicultural? Indiana University Press, Bloomington, IN
Harwood J 1993 Styles of Scientific Thought. University of Chicago Press, Chicago
Hess D 1995 Science and Technology in a Multicultural World. Columbia University Press, New York
Hooykaas R 1990 Science and reformation. In: Cohen I (ed.) Puritanism and the Rise of Science. Rutgers University Press, New Brunswick, NJ
Jacob M 1988 The Cultural Meaning of the Scientific Revolution. Knopf, New York
Kammen M 1987 A Machine that Would Go of Itself. Knopf, New York
Keller E 1985 Reflections on Gender and Science. Yale University Press, New Haven, CT
Knorr-Cetina K 1999 Epistemic Cultures. Harvard University Press, Cambridge, MA
Koyré A 1965 Newtonian Studies. Chapman and Hall, London
Merton R 1973 Sociology of Science. University of Chicago Press, Chicago
Mulkay M 1976 Norms and ideology in science. Social Science Information 15: 637–56
Rossiter M 1995 Women Scientists in America. Johns Hopkins University Press, Baltimore, MD
Schiebinger L 1989 The Mind has no Sex? Harvard University Press, Cambridge, MA
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago
Traweek S 1988 Beamtimes and Lifetimes. Harvard University Press, Cambridge, MA
D. Hess
Scientific Disciplines, History of
The scientific discipline as the primary unit of internal differentiation of science is an invention of nineteenth century society. There exists a long semantic prehistory of disciplina as a term for the ordering of knowledge for the purposes of instruction in schools and universities. But only the nineteenth century established real disciplinary communication systems. Since then the discipline has functioned as a unit of structure formation in the social system of science, in systems of higher education, as a subject domain for teaching and learning in schools, and finally as the designation of occupational and professional roles. Although the processes of differentiation in science are going on all the time, the scientific discipline as a basic unit of structure formation is stabilized by these plural roles in different functional contexts in modern society.
1. Unit Divisions of Knowledge
Disciplina is derived from the Latin discere (learning), and it has often been used since late Antiquity and the early Middle Ages as one side of the distinction disciplina vs. doctrina (Marrou 1934). Both terms
meant ways of ordering knowledge for purposes of teaching and learning. Often they were used synonymously. In other usages doctrina is more intellectual and disciplina more pedagogical, more focused on methods of inculcating knowledge. A somewhat later development among the church fathers adds to disciplina implications such as admonition, correction, and even punishment for mistakes. This concurs with recent interpretations of discipline, especially in the wake of Michel Foucault, making use of the ambiguity of discipline as a term always pointing to knowledge and disciplinary power at the same time (cf. Hoskin in Messer-Davidow et al. 1993). Finally, there is the role differentiation of teaching and learning, and the distinction doctrina/disciplina is obviously correlated with it (Swoboda 1979). One can still find the same understandings of doctrina and disciplina in the literature of the eighteenth century. But what changed since the Renaissance is that these two terms no longer refer to very small particles of knowledge. They point instead to entire systems of knowledge (Ong 1958). This goes along with the ever more extensive use by early modern Europe of classifications of knowledge and encyclopedic compilations of knowledge in which disciplines function as unit divisions of knowledge. The background to this is the growth of knowledge related to developments such as the invention of printing, the intensified contacts with other world regions, and economic growth and its correlates such as mining and building activities. But in these early modern developments the archival function of disciplines still dominates. The discipline is a place where one deposits knowledge after having found it out, but it is not an active system for the production of knowledge.
2. Disciplines as Communication Systems
A first premise for the rise of disciplines as production and communication systems in science is the specialization of scientists and the role differentiation attendant on it (Stichweh 1984, 1992). Specialization is first of all an intellectual orientation. It depends on a decision to concentrate on a relatively small field of scientific activity, and, as is the case for any such decision, one needs a social context supporting it, that is, other persons taking the same decision. Such decisions are rare around 1750 when encyclopedic orientations dominated among professional and amateur scientists alike, but they gained in prominence in the last decades of the eighteenth century. Second, specialization as role differentiation points to the educational system, which is almost the only place in which such specialized roles can be institutionalized as occupational roles. From this results a close coupling of the emerging disciplinary structures in science and the role structures of institutions of higher education.
This coupling is realized for the first time in the reformed German universities of the first half of the nineteenth century and afterwards quickly spreads from there to other countries. Third, role differentiation in institutions of higher education depends on conditions of organizational growth and organizational pluralization. There has to be a sufficient number of organizations which must be big enough to have differentiated roles, and these organizations must be interrelated in an ongoing continuity of interactions. The emergence of communities of specialists is a further relevant circumstance. In this respect the rise of disciplines is synonymous with the emergence of scientific communities theorized about since Thomas Kuhn (Kuhn 1970). Scientific communities rest on the intensification of interaction, shared expertise, a certain commonality of values, and the orientation of community members towards problem constellations constitutive of the respective discipline. Modern science is not based on the achievements of extraordinary individuals but on the epistemic force of disciplinary communities. Scientific communities are communication systems. In this respect the emergence of the scientific discipline is equivalent to the invention of new communication forms specific to disciplinary communities. First of all one may think here of new forms of scientific publications. In the eighteenth century a wide spectrum of publication forms existed; they were not, however, specialized in any way. There were instructional handbooks at the university level, journals of a general scientific nature for a regional public interested in utility, and academy journals aiming at an international public, each covering a wide subject area but with rather limited communicative effects. It was only after 1780 that in France, in Germany, and finally, in England, nationwide journals with a specific orientation on such subjects as chemistry, physics, mineralogy, and philology appeared. In contrast to isolated precursors in previous decades, these journals were able to exist for longer periods exactly because they brought together a community of authors. These authors accepted the specialization chosen by the journal; but at the same time they continually modified this specialization by the cumulative effect of their published articles. Thus the status of the scientific publication changed. It now represented the only communicative form by which, at the macrolevel of the system of science—defined originally by national but later by supranational networks—communication complexes specialized along disciplinary lines could be bound together and persist in the long run (Stichweh 1984, Chap. 6, Bazerman 1988). At the same time the scientific publication became a formal principle interfering in every scientific production process. Increasingly restrictive conditions were defined regarding what type of communication was acceptable for publication. These conditions
included the requirement of identifying the problem tackled in the article, the sequential presentation of the argument, a description of the methods used, presentation of empirical evidence, restrictions on the complexity of the argument accepted within an individual publication, linkage with earlier communications by other scientists—using citations and other techniques—and the admissibility of presenting speculative thoughts. In a kind of feedback loop, publications, as the ultimate form of scientific communication, exercised pressure on the scientific production process (i.e., on research) and were thereby able to integrate disciplines as social systems. This reorganization of the scientific production process adheres to one new imperative: the search for novelties. The history of early modern Europe was already characterized by a slow shift in the accompanying semantics associated with scientific truth, from an imperative to preserve the truth to an interest in the novelty of an invention. The success achieved in organizing traditional knowledge, as well as tendencies towards empirical methods and increased use of scientific instruments, worked toward this end. In this dimension, a further discontinuity can be observed in the genesis of the term research in the years after 1790. In early modern times the transition from the preservation to the enlargement of knowledge could only be perceived as a continual process. In contrast, research from about 1800 refers to a fundamental, and at any time realizable, questioning of the entire body of knowledge until then considered as true. Competent scientific communication then had to be based on research in this sense. What was communicated might be a small particle of knowledge, as long as it was a new particle of knowledge. Scientific disciplines then became research disciplines based on the incessant production of novelties. The link between scientific disciplines and organizations of higher education is mediated by two more organizational structures. The first of these are disciplinary careers. Specialized scientists as members of disciplinary communities do not need only specialized occupational roles. Additionally there may be a need for careers in terms of these specialized roles. This again is a condition which sharply distinguishes eighteenth- from nineteenth-century universities. Around 1750 one could still find, even in German universities, hierarchical career patterns which implied that there was a hierarchical succession of chairs inside faculties and a hierarchical sequence of faculties by which a university career was defined as a progression of steps through these hierarchized chairs. One could, for example, rise from a chair in the philosophical faculty to an (intellectually unrelated) chair in the medical faculty. The reorganization of universities since the early nineteenth century completely discontinued this pattern. Instead of a succession of chairs in one and the same university, a scientific career meant a progression through positions inside a discipline,
which normally demands a career migration between universities. This presupposes intensified interactions and competitive relations among universities which compete for qualified personnel and quickly take up new specializations introduced elsewhere. In Germany such regularized career paths through the national university system were especially to be observed from around 1850. This pattern is again closely related to disciplinary curricula, meaning that one follows one’s disciplinary agenda not only in one’s research practice and personal career, but furthermore that there exist institutional structures favoring teaching along lines close to current disciplinary core developments. The unity of teaching and research is one famous formula for this, but this formula does not yet prescribe disciplinary curricular structures which would demand that there should be a complete organization of academic studies close to the current intellectual problem situation and systematics of a scientific discipline. Only if this is the case does there arise a professionalization of a scientific discipline, which means that a systematic organization of academic studies prepares for a non-academic occupational role which is close to the knowledge system of the discipline. Besides professionalization there is then the effect that the discipline educates its own future research practitioners in terms of the methods and theories constitutive of the discipline. A discipline doing this is not only closed on the level of the disciplinary communication processes, it is also closed on the level of socialization practices and the attendant recruitment of future practitioners (on the operational closure of modern science see Luhmann 1990, Stichweh 1990).
3. The Modern System of Scientific Disciplines
It is not sufficient to analyze disciplines as individual knowledge producing systems. One has to take into account that the invention of the scientific discipline brings about first a limited number, then many scientific disciplines which interact with one another. Therefore it makes sense to speak of a modern system of scientific disciplines (Parsons 1977, p. 300ff., Stichweh 1984) which is one of the truly innovative social structures of the modern world. First of all, the modern system of scientific disciplines defines an internal environment (milieu interne in the sense of Claude Bernard) for any scientific activity whatsoever. Whatever goes on in fields such as physics, sociology, or neurophysiology, there exists an internal environment of other scientific disciplines which compete with that discipline, somehow comment on it, and offer ideas, methods, and concepts. There is normal science in a Kuhnian sense, always involved with problems to which solutions seem to be at hand in the disciplinary tradition itself; but normal science is
always commented upon by a parallel level of interdisciplinary science which arises from the conflicts, provocations and stimulations generated by other disciplines and their intellectual careers. In this first approximation it is already to be seen that the modern system of scientific disciplines is a very dynamic system in which the dynamism results from the intensification of the interactions between ever more disciplines. Dynamism implies, among other things, ever changing disciplinary boundaries. It is exactly the close coupling of a cognitively defined discipline and a disciplinary community which motivates this community to try an expansionary strategy in which the discipline attacks and takes over parts of the domain of other disciplines (Westman 1980, pp. 105–6). This was wholly different in the disciplinary order of early modern Europe, in which a classificatory generation of disciplinary boundaries meant that the attribution of problem domains to disciplines was invariable. If one decided to do some work in another domain, one had to accept that a change over to another discipline would be necessary to do this. Closely coupled to this internally generated and self-reinforcing dynamics of the modern system of scientific disciplines is the openness of this system to new disciplines. Here again arises a sharp difference from early modern circumstances. In early modern Europe there existed a closed and finite catalogue of scientific disciplines (Hoskin 1993, p. 274) which was related to a hierarchical order of these disciplines (for example, philosophy was a higher form of knowledge than history, and philosophy was in its turn subordinated to faculty studies such as law and theology). In modern society no such limit to the number of disciplines can be valid. New disciplines incessantly arise; some old ones even disappear or become inactive as communication systems. There is no center and no hierarchy to this system of the sciences. Nothing allows us to say that philosophy is more important than natural history or physics more scientific than geology. Of course, there are asymmetries in influence processes between disciplines, but no permanent or stable hierarchy can be derived from this. The modern system of scientific disciplines is a global system. This makes a relevant difference from the situation of the early nineteenth century, in which the rise of the scientific discipline seemed to go along with a strengthening of national communities of science (Crawford et al. 1993, Stichweh 1996). This nationalization effect, which may have had to do with a meaningful restriction of communicative space in newly constituted communities, has since proved to be only a temporary phenomenon, and the ongoing dynamics of (sub-)disciplinary differentiation in science seems to be the main reason why national communication contexts are no longer sufficient infrastructures for a rapidly growing number of disciplines and subdisciplines.
4. The Future of the Scientific Discipline
The preponderance of subdisciplinary differentiation in the late twentieth century is the reason most often cited for the presumed demise of the scientific discipline postulated by a number of authors. But one may object to this hypothesis on the ground that a change from disciplinary to subdisciplinary differentiation processes does not at all affect the drivers of internal differentiation in modern science: the relevance of an internal environment as decisive stimulus for scientific variations, the openness of the system to disciplinary innovations, and the nonhierarchical structure of the system. Even if one points to an increasing importance of interdisciplinary ventures (and to problem-driven interdisciplinary research), which one should expect as a consequence of the argument on the internal environment of science, this does not change the fact that disciplines and subdisciplines function as the form of consolidating interdisciplinary innovations. And, finally, there are the interrelations with the external environments of science (economic, political, etc.), which in twentieth- and twenty-first-century society are plural environments based on the principle of functional differentiation. Systems in the external environment of science are dependent on sufficiently stable addresses in science if they want to articulate their needs for inputs from science. This is true for the educational environment of science, which has to organize school and higher education curricula in disciplinary or interdisciplinary terms, for role structures as occupational structures in the economic environment of science, and for many other demands for scientific expertise and research knowledge which always must be able to specify the subsystem in science from which the respective expertise may be legitimately expected. These interrelations, based on structures of internal differentiation in science which have to be identifiable for outside observers, are one of the core components of modern society which, since the second half of the twentieth century, is often described as a knowledge society.
See also: Disciplines, History of, in the Social Sciences; History and the Social Sciences; History of Science: Constructivist Perspectives; Human Sciences: History and Sociology; Knowledge Societies; Scientific Academies, History of; Scientific Culture; Scientific Revolution: History and Sociology; Teaching Social Sciences: Cultural Concerns; Universities and Science and Technology: Europe; Universities and Science and Technology: United States; Universities, in the History of the Social Sciences
Bibliography
Bazerman C 1988 Shaping Written Knowledge: The Genre and Activity of the Experimental Article in Science. University of Wisconsin Press, Madison, WI
Crawford E, Shinn T, Sörlin S 1993 Denationalizing Science: The Contexts of International Scientific Practice. Kluwer, Dordrecht, The Netherlands
Hoskin K W 1993 Education and the genesis of disciplinarity: The unexpected reversal. In: Messer-Davidow E, Shumway D R, Sylvan D J (eds.) Knowledges: Historical and Critical Studies in Disciplinarity. University Press of Virginia, Charlottesville, VA, pp. 271–305
Kuhn T S 1970 The Structure of Scientific Revolutions, 2nd edn. University of Chicago Press, Chicago
Luhmann N 1990 Die Wissenschaft der Gesellschaft. Suhrkamp, Frankfurt am Main, Germany
Marrou H I 1934 'Doctrina' et 'disciplina' dans la langue des pères de l'église. Archivum Latinitatis Medii Aevi 9: 5–25
Messer-Davidow E, Shumway D R, Sylvan D J (eds.) 1993 Knowledges: Historical and Critical Studies in Disciplinarity. University Press of Virginia, Charlottesville, VA
Ong W J 1958 Ramus, Method, and the Decay of Dialogue: From the Art of Discourse to the Art of Reason. Harvard University Press, Cambridge, MA
Parsons T 1977 Social Systems and the Evolution of Action Theory. Free Press, New York
Stichweh R 1984 Zur Entstehung des modernen Systems wissenschaftlicher Disziplinen—Physik in Deutschland 1740–1890. Suhrkamp, Frankfurt am Main, Germany
Stichweh R 1990 Self-organization and autopoiesis in the development of modern science. In: Krohn W, Küppers G, Nowotny H (eds.) Selforganization—Portrait of a Scientific Revolution. Sociology of the Sciences, Vol. XIV. Kluwer Academic Publishers, Boston, pp. 195–207
Stichweh R 1992 The sociology of scientific disciplines: On the genesis and stability of the disciplinary structure of modern science. Science in Context 5: 3–15
Stichweh R 1996 Science in the system of world society. Social Science Information 35: 327–40
Swoboda W W 1979 Disciplines and interdisciplinarity: A historical perspective. In: Kockelmans J J (ed.) Interdisciplinarity and Higher Education. University Park, London, pp. 49–92
Westman R S 1980 The astronomer's role in the sixteenth century: A preliminary study. History of Science 18: 105–47
R. Stichweh
Scientific Discovery, Computational Models of
Scientific discovery is the process by which novel, empirically valid, general, and rational knowledge about phenomena is created. It is, arguably, the pinnacle of human creative endeavors. Many academic and popular accounts of great discoveries surround the process with mystery, ascribing them to a combination of serendipity and the special talents of geniuses. Work in Artificial Intelligence on computational models of scientific reasoning since the 1970s shows that such accounts of the process of science are largely mythical. Computational models of scientific discovery are computer programs that make
discoveries in particular scientific domains. Many of these systems model discoveries from the history of science or simulate the behavior of participants solving scientific problems in the psychology laboratory. Other systems attempt to make genuinely novel discoveries in particular scientific domains. Some have produced new findings of sufficient worth that the discoveries have been published in mainstream scientific journals. The success of these models provides some insights into the nature of human cognitive processes in scientific discovery and addresses some interesting issues about the nature of scientific discovery itself (see Scientific Reasoning and Discovery, Cognitive Psychology of).
1. Computational Models of Scientific Discovery
Most computational models of discovery can be conceptualized as performing a recursive search of a space of possible states, or expressions, defined by the representation of the problem. Procedures are used to search the space of legal states by manipulating the expressions and using tests of when the goal or subgoals have been met. To manage the search, which is typically subject to potential combinatorial explosion, heuristics are used to guide the selection of appropriate operators. This is essentially an application of the theory of human problem solving as heuristic search within a symbol processing system (Newell and Simon 1972). For example, consider BACON (Langley et al. 1987), an early discovery program which finds algebraic formulas as parsimonious descriptions of quantitative data. States in the problem search space of BACON include simple algebraic formulas, such as D/P or D²/P, where, for instance, P is the period of revolution of planets around the sun and D is their distance from the sun. Tests in BACON attempt to find how closely potential expressions match the given quantitative data. Given quantitative data for the planets of the solar system, one step in BACON's discovery path finds that neither D/P nor D²/P is constant, and that the second expression increases monotonically as the first decreases. Given this relation between the expressions, BACON applies its operator to form the product of the terms, i.e., D³/P². This time the test of whether the expression is constant, within a given margin of error, is true. D³/P² = constant is Kepler's third law of planetary motion. For more complex cases with larger numbers of variables, BACON uses discovery heuristics based on notions of symmetry and the conservation of higher order terms to pare down the search space. The heuristics use the underlying regularities within the domain to obviate the need to explore parts of the search space that are structurally similar to previously explored states.
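The search just described is compact enough to sketch in code. The following Python fragment is a minimal, illustrative reconstruction, not the actual BACON program: the variable names, the rounded planetary values, the tolerance, and the duplicate-filtering step are our choices, and only two of BACON's heuristics are implemented (form the ratio of two terms that increase together, and the product of two terms that vary inversely).

def monotonic(xs, ys):
    """Return +1 if ys increases as xs increases, -1 if it decreases, else 0."""
    pairs = sorted(zip(xs, ys))
    inc = all(b[1] > a[1] for a, b in zip(pairs, pairs[1:]))
    dec = all(b[1] < a[1] for a, b in zip(pairs, pairs[1:]))
    return 1 if inc else (-1 if dec else 0)

def constant(vals, tol=0.01):
    """Test whether values are constant within a fractional margin of error."""
    mean = sum(vals) / len(vals)
    return all(abs(v - mean) <= tol * abs(mean) for v in vals)

def bacon(data, max_depth=5):
    """Search combinations of terms for an invariant, BACON-style.

    data maps variable names to equal-length lists of observed values.
    """
    terms = dict(data)  # expression label -> list of values
    seen = {tuple(round(v, 6) for v in vs) for vs in terms.values()}
    for _ in range(max_depth):
        # Test: has any derived expression become constant (a candidate law)?
        for label, vals in terms.items():
            if label not in data and constant(vals):
                return label
        # Operators: combine each unordered pair of terms by ratio or product.
        new_terms = {}
        for a, xs in terms.items():
            for b, ys in terms.items():
                if a >= b:
                    continue  # consider each pair once
                trend = monotonic(xs, ys)
                if trend == 1:       # increase together: try their ratio
                    label, vals = f"({a})/({b})", [x / y for x, y in zip(xs, ys)]
                elif trend == -1:    # vary inversely: try their product
                    label, vals = f"({a})*({b})", [x * y for x, y in zip(xs, ys)]
                else:
                    continue
                sig = tuple(round(v, 6) for v in vals)
                if sig not in seen:  # skip terms that merely duplicate old values
                    new_terms[label] = vals
                    seen.add(sig)
        terms.update(new_terms)
    return None

# Distances D (astronomical units) and periods P (years) for five planets.
D = [0.387, 0.723, 1.000, 1.524, 5.203]
P = [0.241, 0.615, 1.000, 1.881, 11.862]
print(bacon({"D": D, "P": P}))  # prints a derived term whose values are constant,
                                # algebraically equivalent to D**3 / P**2

Run on these data, the search retraces the path in the text: D/P, then D²/P, then their product D³/P², which passes the constancy test.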
Following such an approach, computational models have been developed to perform tasks spanning a full spectrum of theoretical activities, including the formation of taxonomies, the discovery of qualitative and quantitative laws, the creation of structural models, and the development of process models (Langley et al. 1987, Shrager and Langley 1990, Cheng 1992). The range of scientific domains covered is also impressive, ranging from physics and astronomy, to chemistry and metallurgy, to biology, medicine, and genetics. Some systems have produced findings that are sufficiently novel to be worthy of publication in major journals of the relevant discipline (Valdes-Perez 1995).
2. Scope of the Models
Computational models of scientific discovery have almost exclusively addressed theory formation tasks. However, the modeling of experiments has not been completely neglected, as models have been built that design experiments, to a limited extent, by using procedures to specify what properties should be manipulated and measured in an experiment and the range of magnitudes over which the properties should be varied (Kulkarni and Simon 1988, Cheng 1992). For these systems, the actual experimental results are either provided by the user of the system or generated by a simulated experiment in the software. Discovery systems have also been directly connected to a robot which manipulates a simple experimental setup so that data collected from the instruments can be fed to the system directly, so eliminating any human intervention (Huang and Zytkow 1997). Nevertheless, few systems have simulated or supported substantial experimental activities, such as observing or creating new phenomena, designing experiments, inventing new experimental apparatus, developing new experimental paradigms, establishing the reliability of experiments, or turning raw data into evidence. This perhaps reflects a fundamental difference between the theoretical and experimental sides of science. While both clearly involve abstract conceptual entities, experimentation is also grounded in the construction and manipulation of physical apparatus, which involves a mixture of sophisticated perceptual abilities and motor skills. Developing models of discovery that include such capabilities would necessarily require other areas within AI beyond problem solving, such as image processing and robotics. The majority of discovery systems model a single theory formation task. The predominance of such systems might be taken as the basis for a general criticism of computational scientific discovery. The models are typically poor imitations of the diversity of activities in which human scientists are engaged and, perhaps, it is from this variety that scientific creativity arises. Researchers in this area counter such arguments by claiming that the success of such single task systems is a manifestation of the underlying nature of the
process of discovery, that it is composed of subprocesses or tasks that are relatively autonomous. More complex activity can be modeled by assembling systems that perform one task into a larger system, with the inputs to a particular component subsystem being the outputs of other systems. The handful of models that do perform multiple tasks demonstrate the plausibility of this claim (e.g., Kulkarni and Simon 1988, Cheng 1992). The organization of knowledge structures and procedures in those systems exploits the hierarchical decomposition of the overall process into tasks and subtasks. This in turn raises the general question about the number and variety of different tasks that constitute scientific discovery and the nature of their interactions. What distinct search spaces are involved and how is information shared among them? Computational models of scientific discovery provide some insight into this issue. At a general level, many models can be characterized in terms of two spaces, one for potential hypotheses and the other a space of instances or sets of data (Simon and Lea 1974). Scientific discovery is then viewed as the search of each space mutually constrained by the search of the other. Inferring a hypothesis dictates the form of the data needed to test the hypothesis, while the data itself will determine whether the hypothesis is correct and suggest in what ways it should be amended (Kulkarni and Simon 1988). This is an image of scientific discovery that places equal importance on theory and experiment, portraying the overall process as a dynamic interaction between both components. This approach is applicable both to disciplines in which individual scientists do the theorizing and experimenting and to disciplines in which these activities are distributed among different individuals or research groups. The search of the theoretical and experimental spaces can be further decomposed into additional subspaces; for example, Cheng (1992) suggests three subspaces for hypotheses, models, and cases under the theory component, and spaces of experimental classes, setups, and tests under the experimental component.
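In outline, the mutual constraint between the two spaces can be written as a simple control loop. The Python sketch below is schematic only: every function named in it is a placeholder for domain-specific machinery, not part of any published system.

def dual_space_search(propose_hypothesis, design_experiment, run_experiment,
                      consistent, revise, max_cycles=100):
    """Alternate between the hypothesis space and the experiment space.

    propose_hypothesis() -> an initial hypothesis
    design_experiment(h) -> an experiment whose outcome bears on h
    run_experiment(e)    -> observed data
    consistent(h, data)  -> True if the data support h
    revise(h, data)      -> an amended hypothesis suggested by the data
    """
    hypothesis = propose_hypothesis()               # search the hypothesis space
    for _ in range(max_cycles):
        experiment = design_experiment(hypothesis)  # hypothesis constrains the data search
        data = run_experiment(experiment)
        if consistent(hypothesis, data):
            return hypothesis                       # the data confirm the hypothesis
        hypothesis = revise(hypothesis, data)       # the data constrain the hypothesis search
    return None                                     # no stable hypothesis found

Each caller supplies the five functions for its own domain; the loop itself encodes only the mutual constraint between the two searches.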
3. Developing Computational Models of Discovery
One major advantage of building computational models over other approaches to the study of scientific discovery is the precision that is imposed by writing a running computer program. Ambiguities and inconsistencies in the concepts used to describe discovery processes become apparent when attempting to encode them in a programming language. Another advantage of modeling is the ability to investigate alternative methods or hypothetical situations. Different versions of a system may be constructed embodying, say, competing representations to investigate the difficulty
of making the discovery with the alternatives. The same system can be run with different sets of data, for example, to explore whether a discovery could have been made had there been less data, or had different data been available. Many stages are involved in the development of the models, including formulation of the problem, engineering appropriate problem representations, selecting and organizing data, design and redesign of the algorithm, actual invocation of the algorithm, and filtering and interpretation of the findings of the system (Langley 1998). Considering the nature and relative importance of these activities in the development of systems provides further insight into the nature of scientific discovery. In particular, the design of the representation appears to be especially critical to the success of the systems. This implies that generally in scientific discovery finding an effective representation may be fundamental to the making of discoveries. This issue has been directly addressed by computational models that contrast the efficacy of different representations for modeling the same historical episode (Cheng 1996). Consistent with work in cognitive science, diagrammatic representations may in some cases be preferable to informationally equivalent propositional representations. Although computational models argue against any special abilities of great scientists beyond the scope of conventional theories of problem solving, the models suggest that the ability of some scientists to modify or create new representations may be an explanation, at least in part, of why they were the ones to succeed.
4. Conclusions and Future Directions
Given the extent of the development work necessary on a discovery system, it seems appropriate to attribute discoveries as much to the developer as to the system itself, although without the system many of the novel discoveries would not have been possible. This does not imply that machine discovery is impossible, but that care must be taken in delimiting the capabilities of discovery systems. Further, the ability of the KEKADA system (Kulkarni and Simon 1988) to change its goals to investigate any surprising phenomenon it discovers suggests that systems can be developed that would filter and interpret the output of existing systems, by constraining the search of the space defined by the outputs of those systems using metrics based on notions of novelty. Developing such a system, or other systems that find problems or that select appropriate representations, will require the system to possess a substantially more extensive knowledge of the target domain. Such knowledge-based systems are costly and time-consuming to build, so it appears that the future of discovery systems will be more as collaborative support systems for domain
scientists rather than fully autonomous systems (Valdes-Perez 1995). Such systems will exploit the respective strengths of domain experts and the computational power of the models to compensate for each other’s limitations. See also: Artificial Intelligence: Connectionist and Symbolic Approaches; Artificial Intelligence in Cognitive Science; Artificial Intelligence: Search; Deductive Reasoning Systems; Discovery Learning, Cognitive Psychology of; Intelligence: History of the Concept; Problem Solving and Reasoning, Psychology of; Problem Solving: Deduction, Induction, and Analogical Reasoning; Scientific Reasoning and Discovery, Cognitive Psychology of
Bibliography Cheng P C-H 1992 Approaches, models and issues in computational scientific discovery. In: Keane M T, Gilhooly K (eds.) Advances in the Psychology of Thinking. Harvester Wheatsheaf, Hemel Hempstead, UK, pp. 203–236 Cheng P C-H 1996 Scientific discovery with law encoding diagrams. Creativity Research Journal 9(2&3): 145–162 Huang K-M, Zytkow J 1997 Discovering empirical equations from robot-collected data. In: Ras Z, Skowron A (eds.) Foundations of Intelligent Systems. Springer, Berlin Kulkarni D, Simon H A 1988 The processes of scientific discovery: The strategy of experimentation. Cognitive Science 12: 139–75 Langley P 1998 The computer-aided discovery of scientific knowledge. In: Proceedings of the First International Conference on Discovery Science. Springer, Berlin Langley P, Simon H A, Bradshaw G L, Zytkow J M 1987 Scientific Discovery: Computational Explorations of the Creative Processes. MIT Press, Cambridge, MA Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ Shrager J, Langley P (eds.) 1990 Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA Simon H A, Lea G 1974 Problem solving and rule induction: A unified view. In: Gregg L W (ed.) Knowledge and Cognition. Lawrence Erlbaum, Potomac, MD, pp. 105–127 Valdes-Perez R E 1995 Some recent human/computer discoveries in science and what accounts for them. AI Magazine 16(3): 37–44
P. C.-H. Cheng
Scientific Evidence: Legal Aspects Expertise, scientific and otherwise, has been part of the legal landscape for centuries (Hand 1901). Over the last decades of the twentieth century the role of scientific evidence in the law has expanded rapidly, in both regulatory settings and litigation. Statutes and
treaties routinely require agencies to provide scientific justifications for regulatory decisions within and among nations. National and international policy disputes are increasingly fought out within the risk assessment rhetoric of science (Cranor 1993). Courts are another large consumer of science. In both criminal and civil cases, parties believe scientific testimony will make their case stronger. As science’s role has grown, so has interest in the relationship between law and science. This essay reviews the present state of knowledge concerning the science–law relationship and mentions several areas ripe for investigation.
1. What Law Wants from Science In both the administrative and the courtroom context, law is most frequently interested in acquiring scientific answers to practical questions of causation (e.g., does exposure to airborne asbestos cause lung cancer) and measurement (e.g., what is the alcohol level in a driver’s blood). Both of these questions entail further questions as to whether specific techniques (e.g., a breathalyzer) are capable of producing reliable measurements. Theoretical questions per se are frequently of secondary interest, useful insofar as they help courts and regulators to choose among conflicting measurements, extrapolations, or causal assertions. Courts and agencies are both interested in questions of general and specific causation. Agencies are interested in the effects of nitrous oxide on the environment and the specific level of emissions from a particular plant. Courts may be interested in whether a given level of airborne asbestos causes lung cancer. And they may need to determine if a particular plaintiff’s lung cancer was caused by asbestos exposure. It is often much more difficult to achieve a scientifically rigorous answer to this latter type of question. An important difference between courts and agencies is the level of proof necessary to reach a conclusion. In administrative settings agencies may prevail if they can show that a substance poses a danger or a risk. In a courtroom, however, in order to prevail, the plaintiff will be required to show both general and specific causation. Scientific evidence that is sufficient in the regulatory context may be considered insufficient in the court context. For example, animal studies showing a relationship between saccharin consumption and cancer may be sufficient to cause an agency to impose regulations on human exposure to the sweetener, but would not be sufficient to permit a group of plaintiffs to prevail on a claim that they were actually injured by consuming saccharin in diet soft drinks, or even on the general causation claim that saccharin at human dose levels causes cancer in any humans.
2. Legal Control of Scientific Evidence One of the more interesting aspects of the use of science in courts is the legal effort to control the terms of the law–science interaction. Judicial control of scientific experts is shaped by whether the court is operating in an inquisitorial or an adversarial system (van Kampen 1998). In inquisitorial systems, e.g., Belgium, Germany, France, Japan, the judge plays a large role in the production of evidence. Experts are almost always court appointed and are asked to submit written reports. Parties may be given the opportunity to object to a particular expert, question the expert about the opinion rendered, or hire their own expert to rebut the court-appointed expert; but it is very difficult to attack the admissibility of an expert’s report successfully (Langbein 1985, Jones 1994). In adversarial systems the parties usually have far greater control over the production of evidence, including the selection and preparation of expert witnesses. Court-appointed experts are rare (Cecil and Willging 1994). A similar pattern can be observed in the regulatory arena. Jasanoff (1991) compares the relatively closed, consensual, and non-litigious British regulatory approach to the open, adversarial and adjudicatory style found in the United States. Clearly, legal organization and legal culture shape the way in which legal systems incorporate science. In turn, legal organization is related to larger cross-cultural differences. For example, open, adversarial approaches are more pronounced in societies that are more individualistic and lower on measures of power distance (Hofstede 1980). Judicial control of scientific expert testimony has become a high-profile issue in the United States. Critics complain that the combination of adversarial processes and the use of juries in both criminal and civil trials encourages the parties to introduce bad science into the trial process. They argue, with some empirical support, that the use of experts chosen by the parties produces individuals who by their own account stray relatively further from Merton’s four norms of science—universalism, communism, disinterestedness, and organized skepticism (Merton 1973). They submit that jurors are too easily swayed by anyone who is called a scientist regardless of the quality of the science supporting the expert’s position. (See Expert Witness and the Legal System: Psychological Aspects.) Many now call upon judges to play a greater role in selecting and supervising experts, a role closer to that found in inquisitorial systems (Angell 1996). The preference for an inquisitorial style makes several assumptions about the nature of science and the proper relationship between law and science. Similar assumptions underlie the admissibility rules employed by courts. The history of admissibility decisions in American courts provides a useful way to explore these assumptions. Precisely because the knowledge possessed by the scientist is beyond the ken of the court, the judge (or
the jury) often is not in a good position to determine if what the expert is offering is helpful. In a 1923 decision a federal court offered one solution to this problem: acquiesce to the judgment of the community of elites who are the guardians of each specialized area of knowledge. Under the so-called Frye rule experts may testify only if the subject of their testimony had reached ‘general acceptance’ in the relevant scientific community. In 1993 in Daubert vs. Merrell Dow Pharmaceuticals, Inc., and subsequent opinions, the United States Supreme Court moved away from the Frye rule. In its place federal courts have developed a nonexclusive multifactor test designed to assess the validity of the science underlying the expert’s testimony. Factors the courts have mentioned include: whether the theory or technique underlying the expert’s testimony is falsifiable and had been tested; the error rate of any techniques employed; whether the expert’s research was developed independent of litigation; whether the testifying expert exercised the same level of intellectual rigor that is routinely practiced by experts in the relevant field; whether the subject matter of the testimony had been subjected to peer review and publication in refereed journals; and, in a partial retention of the Frye test, whether the theory or technique had achieved general acceptance in the relevant scientific community. The Daubert-inspired test offers an alternative to acquiescence, a do-it-yourself effort on the part of the courts. A number of the factors are consistent with an adversarial legal system that institutionalizes mistrust of all claims to superior authority. The adversary system itself is consistent with the values of a low-power-distance culture that affords relatively less legitimacy to elite authoritative opinion and, therefore, is skeptical that scientists are entitled to a privileged language of discourse from which others are excluded. It is also consistent with a view of science strongly influenced by other forces in society. These are perspectives on the scientific enterprise that are associated with those who adopt a social constructionist view of science (Shapin 1996, Pickering 1992).
3. Legal Understanding of the Scientific Enterprise Daubert, however, offers anything but a social constructionist test if by this we mean that scientific conclusions are solely the result of social processes within the scientific community. At its core, the opinion requires of judges that they become sufficiently knowledgeable about scientific methods so that they can fairly assess the validity of evidence offered at trial. This requirement that scientific testimony must pass methodological muster reflects a positivist approach that is slanted toward a Baconian view of science. The opinion cites with favor a
Popperian view of how to distinguish the scientific enterprise from other forms of knowledge (Popper 1968). In this regard, the opinion is not unique. In their use of scientific evidence, both courts and administrative agencies seem to distinguish the process of science from its products. They accept the constructionist insight that the process of doing science is a social enterprise and is subject to the buffeting, often distorting winds of social, political, economic, and legal influences. At the same time, courts, agencies, and legislatures cling to a realist belief that the products of science may state a truth about the world, or at least something so similar to truth as it is commonly understood at a given point in history that the particular discipline of law does not need to concern itself with the difference. The legal system’s view of science adopts a strong version of what Cole (1992, p. x) calls a realist–constructivist position, i.e., science is socially constructed both in the laboratory and in the wider community, but the construction is constrained by input from the empirical world. It rejects what he calls a relativist–constructionist position that claims nature has little or no influence on the cognitive content of science. The focus on methods is a search for some assurance that the expert has given the empirical world a reasonable opportunity to influence and constrain the expert’s conclusions. Ultimately, the law’s epistemology with respect to science holds that there are a set of (social) practices often given the shorthand name ‘the scientific method’ which increase the likelihood that someone will make positive contributions to knowledge; a set of practices that scientists themselves frequently point to as the sources of past scientific success (Goldman 1999). There is a large dose of pragmatism in all of this, of course, and the Daubert rule itself has been cited as an example of ‘the common law’s genius for muddling through on the basis of experience rather than logic’ (Jasanoff 1995, p. 63). Not surprisingly, some have criticized the courts for failing to adopt a philosophically coherent admissibility rule (Schwartz 1997). The court’s admissibility rulings do seem to have proceeded in happy obliviousness to the ‘science wars’ that arguably began with Fleck (1979), flourished with Kuhn (1962) and raged for much of the last half of the twentieth century between the defenders of a more traditional, positivist view of science and those critics who emphasize its historical, political, social, and rhetorical aspects (Leplin 1997, Latour 1999). The same could be said of administrative use of science. The rejection of relativist views of science does not mean that all legal actors hold identical views. It would be valuable to map the beliefs of legal actors on central disputes in the science wars, and how these beliefs impact their use of science. For example, if, as seems likely, plaintiff personal injury lawyers in the United States hold a more relativist view of science, how, if at all, does this affect their selection and
preparation of experts and, in turn, how are these experts received in courts?
4. Law–Science Interdependence Law’s approach to science should not be understood in terms of science alone but rather in terms of the law–science interaction. There are several dimensions to this relationship. First, although modern legal systems may recognize that scientists are influenced by the social world around them, permitting radical deconstruction that would undermine science’s claim to special status is difficult to imagine (Fuchs and Ward 1994). The modern state is increasingly dependent upon science as a source of legitimacy. By turning to science for solutions to complex environmental and safety issues, legislatures are able to avoid making difficult political choices while giving the appearance of placing decision-making in the hands of apparently neutral experts who are held in very high esteem relative to other elites (Lawler 1996). The advantages of this approach are so great that agencies frequently engage in a ‘science charade’ in which they wrap their decision in the language of science even when there is very little research supporting a regulation (Wagner 1995). Second, and related to the first observation, the law–science interaction is one of the important ways in which the state helps to produce and maintain science’s dominant position in modern Western society. In both its administrative and judicial actions, the law defines what is and what is not scientific knowledge, and thereby assists science in important boundary maintenance work of excluding many from the community of ‘scientists.’ For example, in the case of United States vs. Starzecpyzel, the court permitted the state’s handwriting experts to testify on the question of whether the defendants had forged documents, but only if they did not refer to themselves as ‘scientists.’ Moreover, legal decisions contain an implicit epistemology that reinstitutionalizes one view of the nature of scientific knowledge. When courts and other institutions of the state reject a relativist view of science that argues the empirical world has little, if any, influence on what is accepted as true by the scientific community, they help to marginalize anyone who adopts this position. There can be little doubt that the view is given little or no attention in law. A cursory search for the names of individuals most associated with these debates in American federal cases finds over 100 references to Popper alone, but no more than one or two passing references to the more prominent critics of traditional understandings of the scientific enterprise. Third, the exact contours of the law–science interaction are shaped by a society’s legal structure and its legal culture. For example, the American legal system’s realist–constructivist understanding of science fits neatly with its own normative commitment to
both ‘truth’ and ‘justice’ as legitimate dispute resolution goals. Ideally, cases should be correctly decided, should arrive at the truth. A realist view of science is consistent with the idea that a court may arrive at the correct outcome. But the truth is contested, and the courts should also give litigants the sense that they were listened to; that they received procedural justice (Tyler 1990). A constructivist understanding of the scientific enterprise legitimates the right of each party to find an expert who will present their view of the truth. We may hypothesize that all legal systems tend toward an epistemology of science and the scientific enterprise that fits comfortably within their dominant methods of law-making and dispute settlement. We might expect, therefore, that inquisitorial systems would be more skeptical of a constructivist view of the scientific enterprise than is the case in adversarial legal systems.
5. Effects of Growing Interdependence As this essay attests, science’s impact on legal processes grows apace. In many areas, an attorney unarmed with scientific expertise operates at a significant disadvantage. A growing number of treatises attempt to inform lawyers and judges of the state of scientific knowledge in various areas (Faigman et al. 1997). At the level of the individual case, there is some evidence that scientific norms are altering the way scientifically complex lawsuits are tried. Restrictive admissibility rules are one of several ways that courts in the United States may restrict traditional adversary processes when confronted with cases involving complex scientific questions. What of law’s effects on science? If the supply of science on some issue is affected by demand, legal controversy should draw scientists to research topics they might otherwise have eschewed. There is evidence that this does occur. The subject matter of scientific research is shaped by legal controversy in a wide number of areas, from medical devices (Sobol 1991) to herbicides (Schuck 1986). Some have argued that law’s interest has an additional effect of causing the production of ‘worse’ science (Huber 1991), but it might also be argued that in areas such as DNA testing, law’s interest has produced better science and better technology than might otherwise have existed. On the other hand, the threat of legal controversy may impede research into some areas, such as the development of new contraceptive devices. Even if we believe all science is produced through a process of social construction, it is also the case that most scientists believe that their work is constrained by the empirical world (Segerstråle 1993). Moreover, they often attempt to surround themselves with a social structure that keeps society at a distance. Scientific groups are increasingly attempting to propagate
ethical standards for those who offer expert testimony in courts, reflecting the widespread belief that scientists who typically appear in legal arenas differ from their colleagues on these dimensions. Finally, legal use of science affects the closure of scientific debates (Sanders 1998). On the one hand, law may perpetuate controversy by supplying resources to both sides. On the other hand, it may help to bring a dispute to closure by authoritatively declaring one side to be correct. See also: Expert Systems in Cognitive Science; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Parties: Litigants and Claimants; Science and Law; Science and Technology Studies: Experts and Expertise; Statistics as Legal Evidence
Bibliography Angell M 1996 Science on Trial: The Clash of Medical Evidence and the Law in the Breast Implant Case. Norton, New York Cecil J S, Willging T E 1994 Court-appointed experts. In: Federal Judicial Center Reference Manual on Scientific Evidence. West, St. Paul, MN Cole S 1992 Making Science: Between Nature and Society. Harvard University Press, Cambridge, MA Cranor C F 1993 Regulating Toxic Substances: A Philosophy of Science and the Law. Oxford University Press, New York Daubert vs. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993) Faigman D L, Kaye D, Saks M, Sanders J 1997 Modern Scientific Evidence: The Law and Science of Expert Testimony. West Group, St. Paul, MN Fleck L 1979 Genesis and Development of a Scientific Fact. University of Chicago Press, Chicago, IL Fuchs S, Ward S 1994 What is deconstruction, and where and when does it take place? Making facts in science, building cases in law. American Sociological Review 59: 481–500 Goldman A I 1999 Knowledge in a Social World. Clarendon Press, Oxford, UK Hand L 1901 Historical and practical considerations regarding expert testimony. Harvard Law Review 15: 40 Hofstede G H 1980 Culture’s Consequences: International Differences in Work-related Values. Sage, Beverly Hills, CA Huber P W 1991 Galileo’s Revenge: Junk Science in the Courtroom. Basic Books, New York Jasanoff S 1991 Acceptable evidence in a pluralistic society. In: Mayo D G, Hollander R D (eds.) Acceptable Evidence: Science and Values in Risk Management. Oxford University Press, Oxford, UK Jasanoff S 1995 Science at the Bar: Law, Science, and Technology in America. Harvard University Press, Cambridge, MA Jones C A G 1994 Expert Witnesses: Science, Medicine, and the Practice of Law. Clarendon Press, Oxford Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL Langbein J H 1985 The German advantage in civil procedure. University of Chicago Law Review 52: 823–66 Latour B 1999 Pandora’s Hope: Essays on the Reality of Science Studies. Harvard University Press, Cambridge, MA
Lawler A 1996 Support for science stays strong. Science 272: 1256 Leplin J 1997 A Novel Defense of Scientific Realism. Oxford University Press, New York Merton R K 1973 The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, Chicago, IL Pickering A (ed.) 1992 Science as Practice and Culture. University of Chicago Press, Chicago, IL Popper K R 1968 The Logic of Scientific Discovery. Hutchinson, London Sanders J 1998 Bendectin On Trial: A Study of Mass Tort Litigation. University of Michigan Press, Ann Arbor, MI Schuck P H 1986 Agent Orange On Trial: Mass Toxic Disaster in the Courts. Belknap Press of Harvard University Press, Cambridge, MA Schwartz A 1997 A ‘dogma of empiricism’ revisited: Daubert vs. Merrell Dow Pharmaceuticals, Inc. and the need to resurrect the philosophical insight of Frye vs. United States. Harvard Journal of Law and Technology 10: 149 Segerstråle U 1993 Bringing the scientist back in: The need for an alternative sociology of scientific knowledge. In: Brante T, Fuller S, Lynch W (eds.) Controversial Science: From Content to Contention. State University of New York Press, Albany, NY Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago, IL Sobol R B 1991 Bending The Law: The Story of the Dalkon Shield Bankruptcy. University of Chicago Press, Chicago, IL Tyler T R 1990 Why People Obey the Law. Yale University Press, New Haven, CT United States v. Starzecpyzel, 880 F. Supp. 1027 (S.D.N.Y. 1995) van Kampen P T C 1998 Expert Evidence Compared: Rules and Practices in the Dutch and American Criminal Justice System. Intersentia Rechtswetenschappen, Antwerp, Belgium Wagner W E 1995 The science charade in toxic risk regulation. Columbia Law Review 95: 1613–723
J. Sanders
Scientific Instrumentation, History and Sociology of In the course of the last seventy years, historical and sociological analysis of instrumentation has changed considerably. World War II serves as an important watershed. Before the war, ‘instrumentation’ referred mainly to scientific instruments. They figured in experimentation, whose purpose was to demonstrate the truths of theory by making theoretical claims visible. After 1945 the ways in which instrumentation has been perceived and the functions attributed to it have multiplied and expanded, taking into account not only devices in the science laboratory, but also apparatus used in industry, government, health care, the military, and beyond. Instrumentation is now identified in many areas of science and technology
studies as central to research, engineering, industrial production, and to the processes of innovation. It is perceived as a mechanism that conditions the content of knowledge and affects the organization of work and even broader societal interactions. In the writings of early twentieth-century students of science, instrumentation was seldom discussed and never highlighted. Historian-philosophers of science like Gaston Bachelard (1933, 1951) saw science chiefly in terms of the development of new theory. In this idealist historiographical tradition, experimentation received little attention, and instrumentation was only treated as a prop for experiments whose function was to document the discoveries embodied in scientific theories. Instruments did not invite study. Questions of instrument design, construction, and use, and their limitations, went unattended. This is not to suggest, however, that there was no interest in scientific devices at the time. A few scholars were fascinated by them, and they strove to preserve apparatus and to unearth new devices. They were interested in the technical intricacies and novelty of instruments, for example those of Galileo Galilei. Devices were often viewed as antiquities, and due to this focus the distinction between the work of curators of science museums and instrument scholarship was sometimes a fine one. Here again, instrumentation was not treated as an active component in the knowledge production process, nor was it regarded as problematic in terms of its invention, use, or impact on the organization of research and science and technology communities. Stimulating new perspectives in scientific instrumentation arose in the 1970s and 1980s in connection with historical and sociological investigations of post-1945 big science. For a long time most of the research that portrayed instrumentation as a central component of science and technology focused on devices in the physical sciences. The classic study by John Heilbron and Robert Seidel (1989) of the Berkeley cyclotron in the 1930s is emblematic of the new place of instrumentation in contemporary historiography. Issues of design, finance, engineering, and construction lay at the center of the cyclotron study. The cyclotron was portrayed as an instrument whose technical and social facets involved uncertainties. It was not a ‘pure’ instrument that reflected science’s drive to probe the physical world. While the cyclotron in part served this objective, the instrument also reflected the economic and institutional environment of the San Francisco region, the hope for better healthcare, financial concessions wrung from the government, and involvement by wealthy research foundations and industry. This history demonstrates that scientific instrumentation may be guided by the scientific community, but that it is sometimes spawned by circumstances and forces outside the pale of science. The idea that instruments are not neutral devices that serve science but elements that give structure to
the scientific community first took root in studies of radio astronomy. This provocative concept was quickly extended to the sphere of high-energy physics at large (Krige 1996), oceanography (Mukerji 1992), and space science (Krige 2000). David Edge and Michael Mulkay (1976) first demonstrated that a scientific discipline, radio astronomy, which emerged in the 1950s and 1960s, was directly linked to or even defined by the design, construction, and diffusion of an altogether novel device: the radio telescope (itself an outgrowth of microwave technical research). The radio telescope discovered astronomical bodies and events. It contributed importantly to the birth of a new speciality, with its own university departments, journals, and national and international congresses. A new scientific instrument transformed knowledge, and it also affected the very institution of science. In a more speculative, even iconoclastic representation of scientific instruments, they are depicted as the key to research career success, and yet more assertively, as decisive motors in determining what is true in science. In some areas of science, the equipment crucial to carrying out telling experiments is extremely scarce due to the expense of acquisition and operation. By virtue of possessing a monopoly in an area of instrumentation, a scientist or laboratory can exercise control over the production of the best experimental data. Studies in this vein have been done for the fields of biology (Latour and Woolgar 1979) and physics (Pickering 1984, Pinch 1986). In this perspective Bruno Latour (1987) has suggested that scientific instruments yield not merely professional and institutional advantage, but more important, what is true and false, valid and invalid in science. A researcher’s dealings with instruments empower him or her to be heard and to be ‘right’ during scientific controversies. By dint of possessing a strategic apparatus, a laboratory is well positioned to establish what is and is not a sound claim. Latour insists that arguments and findings based on ‘weaker’ instruments, on mathematics, and on rational evaluation are a poor match against a truly powerful scientific instrument. Analyses of this sort are diametrically opposite to those of pre-World War II idealist historiography. A balanced and subtle account of the intellectual and social contribution of instrumentation to the work of scientific research is found in the writings of Peter Galison. In How Experiments End (1987), this author argues that in twentieth-century microscopic experimental physics, instrument-generated signals are often crucial to settling rival claims, and he suggests that the work of separating noise from signals constitutes a key component in this process. In Image and Logic, Galison (1997) states that, contrary to what is often argued, scientific findings are not the outcome of interactions between theory and experimentation. He factors in a third element, namely instrumentation. Science thus derives from a triangular exchange between theory,
experimentation, and instrumentation. Galison speaks of a ‘trading zone’: a language and realm where these three currents merge, and where intelligibility is established. The brand of philosophical realism espoused by Ian Hacking (1983, 1989) likewise accords a central position to instrumentation. He insists that physical entities exist to the extent that instruments generate unarguably measurable effects. The classical example given by Hacking is the production of positrons whose presence induces palpable technical effects. Today, other science and technology studies identify instrumentation as an element which influences and sometimes structures the organization of work. Instruments are depicted as rendering obsolete some activities (functions) and some groups, as stimulating fresh functions, and as helping create the backdrop for organizational transformations. In the case of early big science, high-energy physics, highly specialized physicists were replaced in the task of particle detection and tracking by armies of low-skilled women observers because of the emergence of alternative large-scale photographic technologies and protocols. In parallel, new technical roles, framed by new work arrangements, arose for engineering and technician cadres who were assigned to design or maintain novel devices or to assure effective interfaces between instrument packages. In a different sphere, the introduction of the first electronic calculators and computers near the end of World War II transformed occupations and work organization as they spelled the end of the human calculators (mostly women) who during the early years of the war contributed to the military’s high-technology programs. Due to the advent of the electronic computer in the late 1940s and 1950s, small horizontally organized work groups supplanted the former vastly bigger and vertically structured work system. Outside science too, for example in the military and industry, instrumentation is also currently viewed as having a structuring impact on the organization of labor. Control engineering, connected as it is with chains of cybernetic-based instrument systems, is depicted as having profoundly modified the organization of certain activities in military combat operations, fire control, and guidance. Other studies highlight instrumentation as a force behind changes in the composition and organization of many industrial tasks and occupations, through automation and robotics (Noble 1984, Zuboff 1988). These alter the size of a work force and its requisite skills, and the internal and external chains of hierarchy and command. Throughout the 1980s and 1990s, students of science and technology tracked and analyzed the development, diffusion, and implantation of devices in spheres increasingly distant from science and the laboratory. The concept ‘instrumentation’ took on a broader and different meaning from the initial historical and sociological concept of ‘scientific instrumentation.’
Studies of medical instrumentation led the way in this important transformation. Stuart Blume’s (1992) investigations of the emergence and diffusion of CAT-scan devices, NMR imaging, and sonography illuminated the links between academic research, industrial research, the non-linear processes of industrial development of medical instrumentation, and the far-flung applications of such apparatus. Blume’s and similar analyses had the effect of introducing significant complexities into the earlier fairly clear-cut notion of instrumentation. They blur the former understanding of a ‘scientific instrument’ by showing the multiple sites of its origins and development. They further show that an instrument may possess multiple applications, some in science and others in engineering, industry, metrology, and the military. One thing stands out with clarity: a scientific instrument is frequently coupled to industry in powerful ways, through design, construction, diffusion, and maintenance. This link is historically situated in the late nineteenth and the twentieth centuries, and it appears to grow constantly in strength. This situation is pregnant with material and epistemological implications for science and beyond, as experiment design, laboratory work practices, reasoning processes, the organization of industry, and daily life are now all so interlocked with instrumentation. With only a few exceptions, however, little historical and sociological work has thus far concentrated on firms specifically engaged in the design, construction, and diffusion of instrumentation. It was not until the late nineteenth and early twentieth centuries that the military general staffs and politicians of certain countries (Germany, France, Great Britain, and somewhat later the US, the USSR, and Japan) began to perceive that their nation’s fate was tightly bound up with the quantity and quality of the instrumentation that endogenous industry could conceive and manufacture. In part because of this new consciousness, as well as the growth of scientific research, the number of companies involved in instrument innovation and production increased impressively. Mari Williams (1994) indicates that instrument companies must be thought of as key components in national systems of education, industry, government policy, and the organization of science. During much of the nineteenth century, France enjoyed the lead. England challenged the French instrument industry at the turn of the century, but by this time leadership clearly belonged to recently united Germany. Success in the scientific instrument industry appears to have been associated with tight organic bonds between specific firms and specific research laboratories, which was the case in England for companies close to the Cavendish. Germany’s immense success was the product of an organic association between the military, government, industrial manufacturing, and instrument-making firms (Joerges and Shinn 2001). In the twentieth century, such proximity (not only geographic but perhaps more
particularly in terms of working collaborations and markets) similarly proved effective in the United States. Does there exist an instrument-making or instrument-maker culture? This question too has received little attention, and any answer would have to be historically limited. Nevertheless, one often-cited study does address this issue, albeit indirectly. The sociologist Daniel Shimshoni (1970) looked at the job mobility of a sample of post-World War II US instrument specialists. He discovered that instrument specialists changed jobs more frequently than other similarly trained personnel. However, the reasons behind this high mobility received scanty attention. One interpretation (not raised by Shimshoni) is that instrument specialists change employers in order to carry their instruments into fresh environments. Alternatively, once instrument specialists have performed their assigned tasks, perhaps employers encourage their departure from the firm, and instrument men are consequently driven to seek work elsewhere. One theme that has gained considerable attention during the 1980s and 1990s is the relationship between innovation and instrumentation. In an influential study, the sociologist of industrial organization and innovation Eric Von Hippel (1988) has explored the sites in which instrument innovations arise, and has examined the connections between those sites and the processes of industrial innovation. He indicates that a sizable majority of industrially relevant instrument novelties comes from inside industry, and definitely not from academia or from firms specialized in instrumentation. The instruments are most frequently the immediate and direct consequence of locally experienced technical difficulties in the realms of product design, manufacture, or quality control. Instrumentation often percolates laterally through a company, and is thus usually home-used as well as home-grown. Many devices do not move beyond the firm in which they originate. Hence, while in some instances dealings with academia and research may be part of industry practice, instrumentation is normally only loosely tied to science. When the connection between instrumentation development and science is strong, it is, moreover, often the case that industry-spawned instrumentation percolates down to the science laboratory rather than academia-based devices penetrating industry. Some sociologists of innovation suggest that instrumentation in industry is linked to research and development, and through it to in-house technology and to company performance. According to some studies, during the 1970s and 1980s instrumentation correlated positively with industrial performance and company survival for US plants in a range of industrial sectors (Hage et al. 1993). Firms that exhibited small concern for advanced technology tended to stumble or
close when compared with companies that actively sought new technologies. According to the Von Hippel hypothesis, an important fraction of innovative technology would take the form of in-house instrument-related innovation. The connection between academia, instrument innovation, and economic performance has been approached from a different perspective in a more narrowly researched and fascinatingly speculative study carried out by the economic historian Nathan Rosenberg (1994). Rosenberg suggests that many key instrument innovations having a great impact on industry spring from fundamental research conducted in universities. To illustrate this claim, he points to the university-based research on magnetic spin and resonance conducted by Felix Bloch at Stanford University in the late 1940s and 1950s. This basic physical research (theoretical and experimental) gave rise to nuclear magnetic resonance (NMR) and to NMR instrumentation; and in turn, NMR apparatus and related devices have spread outward in American industry, giving rise to new products and affecting industrial production operations. The best-known use of NMR instrumentation is in the area of medical imaging. Rosenberg suggests that the spillover effect of academic instrument research has been underestimated, and that the influence of academic research through instrumentation is characterized by a considerable multiplier effect. In the final analytical orientation indicated here, instrumentation is represented as a transverse epistemological, intellectual, technical, and social force that promotes a convergence of practices and knowledge among scientific and technological specialities. Instrumentation acts as a force that partly transcends the distinctions and differentiations tied to specific divisions of labor associated with particular fields of practice and learning. This transcending and transverse function is reserved to a particular category of instrumentation: ‘research-technology’ (Shinn 1993, 1997, 2000, Joerges and Shinn 2001). Research-technology is built around ‘generic’ devices that are open-ended general instruments. They result from basic instrumentation research and instrument theory. Examples include automatic control and registering devices, the ultracentrifuge, Fourier transform spectroscopy, the laser, and the microprocessor. Practitioners transfer their products into academia, industry, state technical services, metrology, and the military. By adapting their generic products to local uses, they participate in the development of narrow niche apparatus. The research-technologist operates in an ‘interstitial’ arena between established disciplines, professions, employers, and institutions. It is this in-between position that allows the research-technologist to design generalist non-specific generic devices, and then to circulate freely in and out of niches in order to diffuse them. Through this multi-audience diffusion, a form of
practice-based universality arises, as a generic instrument provides a lingua franca and guarantees stable outcomes to multiple groups involved in a sweep of technical projects. Throughout the 1980s and 1990s studies devoted to instrumentation, or studies in which instrumentation plays a leading role, grew appreciably. The impressive number of instrument-related articles appearing in many of the major journals of the social studies of science and technology testifies to the fact that the theme is now strong. Two general tendencies characterize today’s enquiries into instrumentation: first, a great diversity in the number of analytic currents and research schools which perceive instrumentation as basic to past and contemporary cognitive and organizational activities; and second, considerable variety in the spheres of activity that are putatively affected by instrumentation and in the mechanisms that allegedly underpin instrument influence. See also: Experiment, in Science and Technology Studies; History of Technology; Innovation, Theory of; Technological Innovation; Technology, Anthropology of
Bibliography Bachelard G 1933 Les intuitions atomistiques. Boivin, Paris Bachelard G 1951 L’activité rationaliste de la physique contemporaine. Presses Universitaires de France, Paris Blume S 1992 Insight and Industry: On the Dynamics of Technological Change in Medicine. MIT Press, Cambridge, MA Edge D O, Mulkay M J 1976 Astronomy Transformed: The Emergence of Radio Astronomy in Britain. Wiley, New York Galison P 1987 How Experiments End. University of Chicago Press, Chicago Galison P 1997 Image and Logic: A Material Culture of Microphysics. University of Chicago Press, Chicago Hacking I 1983 Representing and Intervening. Cambridge University Press, Cambridge, UK Hacking I 1989 The divided circle: a history of instruments for astronomy, navigation and surveying. Studies in the History and Philosophy of Science 20: 265–370 Hage J, Collins P, Hull F, Teachman J 1993 The impact of knowledge on the survival of American manufacturing plants. Social Forces 72: 223–46 Heilbron J L, Seidel R W 1989 Lawrence and his Laboratory: A History of the Lawrence Berkeley Laboratory. University of California Press, Berkeley, CA Joerges B, Shinn T 2001 Instrumentation Between Science, State and Industry. Kluwer, Dordrecht, The Netherlands Krige J 1996 The ppbar project. In: Krige J (ed.) History of CERN. North-Holland, Amsterdam, Vol. 3 Krige J 2000 Crossing the interface from R&D to operational use: The case of the European meteorological satellite. Technology and Culture 41(1): 27–50 Latour B 1987 Science in Action: How to Follow Scientists and Engineers Through Society. Harvard University Press, Cambridge, MA Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, Beverly Hills, CA
Mukerji C 1992 Scientific techniques and learning: Laboratory ‘signatures’ and the practice of oceanography. In: Bud R, Cozzens S (eds.) Invisible Connections: Instruments, Institutions and Science. SPIE Optical Engineering Press, Bellingham, WA Noble D F 1984 Forces of Production: A Social History of Industrial Automation, 1st edn. Knopf, New York Pickering A 1984 Constructing Quarks: A Sociological History of Particle Physics. University of Chicago Press, Chicago Pinch T J 1986 Confronting Nature: The Sociology of Solar-neutrino Detection. Reidel, Dordrecht, The Netherlands Rosenberg N 1994 Exploring the Black Box: Technology, Economics, and History. Cambridge University Press, Cambridge, UK Shimshoni D 1970 The mobile scientist in the American instrument industry. Minerva 8(1): 58–89 Shinn T 1993 The Bellevue grand électroaimant, 1900–1940: Birth of a research-technology community. Historical Studies in the Physical Sciences 24(1): 157–87 Shinn T 1997 Crossing boundaries: The emergence of research technology communities. In: Leydesdorff L, Etzkowitz H (eds.) Universities and the Global Knowledge Economy: A Triple Helix of University–Industry–Government Relations. Cassell Academic Press, London Shinn T 2000 Formes de division du travail scientifique et convergence intellectuelle (Forms of division of scientific labor and intellectual convergence). Revue Française de Sociologie 41: 447–73 Von Hippel E 1988 The Sources of Innovation. Oxford University Press, Oxford, UK Williams M E 1994 The Precision Makers: A History of the Instrument Industry in Britain and France 1870–1939. Routledge, London Zuboff S 1988 In the Age of the Smart Machine: The Future of Work and Power. Basic Books, New York
T. Shinn
Scientific Knowledge, Sociology of 1. Origins The central concern of the sociology of knowledge is the idea that what people take to be certain is an accident of the society in which they are born and brought up. It is obvious that religious or political truths are largely affected by their social settings. If the truths are moral then such relativism is more troubling. But if the truths are scientific, then the idea of social determination is widely regarded as subversive, or self-defeating—since the science which purports to show that science is socially situated is itself socially situated. The sociology of knowledge, as formulated by Mannheim (1936), avoided these dangerous and dizzying consequences by putting scientific and mathematical knowledge in a special category; other kinds of knowledge had roots in society but the knowledge of the natural sciences was governed by nature or logic. Scientific method, then, when properly applied, would insulate scientists from social influences and their knowledge should be more true and more
universal than other kinds of knowledge. From the early 1970s, however, groups of sociologists, philosophers, historians, and other social scientists began programs of analysis and research which treated scientific knowledge as comparable with other kinds of knowledge. Their work broke down the barrier between ordinary knowledge and scientific knowledge. The post-1970s Sociology of Scientific Knowledge (SSK), which included major contributions from historians of science, shows us how to understand the impact of social influences on all knowledge. We can use the field’s own approach to query its intellectual origins. A suitable model is a well-known paper by a historian, Paul Forman (1971), who argued that the rise of quantum theory owed much to the ‘Weltanschauung’ of Weimar Germany. In the same way it could be argued that the 1960s saw the growth of postwar prosperity in Europe which allowed the development of new markets with young people becoming powerful consumers of goods and producers of culture—much of it consisting of a rebellion against traditional forms of expression; it saw the development of the birth control pill, making possible a (temporary) breakdown of sexual mores; and it saw experiments with perception-affecting drugs, while protests against the Vietnam war threatened old lines of institutional authority. The new order had its counterpart in academe. Antipsychiatry questioned the barriers between the sane and the insane, while in Europe at least, romanticized versions of Marxism dominated debate. SSK’s questioning of the traditional authority of science can plausibly be seen as a product of this social ferment. The emblematic book of the period as far as science was concerned was Thomas Kuhn’s (1962) The Structure of Scientific Revolutions. Kuhn’s book was influential in helping to reinforce the intellectual atmosphere in which radical ideas about science could flourish. Kuhn’s notion of the incommensurability of paradigms, and paradigm revolution, provided a way of thinking about scientific knowledge as a product as much of culture as of nature. The book was not widely discussed until late in the 1960s, when the new social movements were at their height, and this adds force to the argument. Philosophy of science in the early 1970s was also affected by this turn. The so-called ‘Popper–Kuhn’ debate (Lakatos and Musgrave 1970) pitted Karl Popper’s (e.g., 1959) notion that scientific theories could be falsified with near certainty, even if they could not be proved, against Thomas Kuhn’s (1961, 1962) claim that what was taken to be true or false varied in response to sudden revolutions in scientific world view. The ‘Duhem–Quine thesis’ (Losee 1980) showed that scientific theories were supported by networks of observations and ideas such that no one observation could threaten any other part of the network directly. Imre Lakatos’s ([1963]/1976) brilliant analysis of the history of Euler’s theorem showed
that falsification of a theory was only one choice when faced with apparently recalcitrant observations. Lakatos’s work was particularly attractive to the social analyst of science because it dealt with the detailed history of a real case rather than arising from general principles. Lakatos was showing how mathematicians argued (or could have argued in principle). This was the world of philosophical activity into which the sociology of scientific knowledge was born. Like any other group of knowers, sociologists of scientific knowledge prefer models of the genesis of their own ideas which reflect their self-image as ‘rational creatures’ or, in Mannheim’s phrase, ‘free floating intellectuals.’ In these models Kuhn plays a much smaller role. There are two intellectual stories. One turns on Ludwik Fleck’s ([1935]/1979) Genesis and Development of a Scientific Fact, which Kuhn cites in the preface of his 1962 book, and which already contained many of the key ideas found therein. Furthermore, Fleck’s book anticipated the developments in sociology of scientific knowledge which came in the 1970s in that, unlike The Structure of Scientific Revolutions, it included a case study of a contemporaneous scientific episode, Fleck’s own research on the Wassermann reaction for the diagnosis of syphilis. Indeed, Fleck’s book remains unique in sociology of science in that his social analysis was of a pioneering piece of scientific research which he himself was pioneering. Thus it is the most perfect piece of participant observation yet done in the social study of scientific knowledge. Fleck’s book, however, was not widely known until many years after Kuhn’s was published, and not until long after the sociology of scientific knowledge had established its own way of doing things. Fleck, then, was the first to set out some of the crucial ideas for SSK, but he had little influence on the new movement except through his influence on Kuhn, while Kuhn helped open up the social and intellectual space for what came after rather than providing intellectual foundations or methodological principles; in other words, Kuhn provided the intellectual space but not, to use his own term, the paradigm. The origin of SSK’s intellectual and methodological paradigm has to be understood by looking at more direct influences. In the 1970s a group of philosophers, concerned with defending science against what they perceived as its critics, put forward the argument that sociological analysis could be applied only to knowledge that was in error, whereas true knowledge remained insulated from social forces. The intellectual source for the sociologists who inspired these initial attacks and fought against them was yet another philosopher, Ludwig Wittgenstein, whose work was central to the anthropological debate about whether there were universal standards of rationality (Wilson 1970). Readings of Ludwig Wittgenstein’s later philosophy (1953, 1956, Winch 1958) were the crucial theoretical resource for the work in sociology of
scientific knowledge that had its origins in the mid-1970s. The ‘Edinburgh School’ and its ‘strong program’ argued on philosophical grounds that ‘symmetrical analysis’—analysis that treated true knowledge and false knowledge equally—was possible. Historical studies conducted in this framework showed how social influence affected scientific conclusions, in science in general, in early studies of the brain, in the development of statistics, and in high-energy physics (Bloor 1973, 1983, Barnes 1974, Shapin 1979, MacKenzie 1981, Pickering 1984). Independently, a group at the University of Bath applied Wittgensteinian ideas to contemporary episodes of science, particularly controversies in physics and the fringe sciences, and developed the methodology of interviewing at networks of competing scientific centers. Its program became known as the ‘Empirical Programme of Relativism’ (EPOR; Collins 1975, 1985, Pinch 1977, 1986, Travis 1980, 1981). These approaches were to become quite widely adopted in the early years of SSK. As the 1970s turned into the 1980s other groups, more strongly influenced by anthropological practice, entered and influenced the field. The Edinburgh–Bath approach was influenced by anthropology in that much of the discussion of the meaning of Wittgensteinian philosophy for the social sciences turned on questions to do with the knowledge of isolated ‘tribes’ and its relationship to Western knowledge. The newer contributors, however, took their methodological approach as well as their philosophy from anthropology. Their work founded the tradition of ‘laboratory studies.’ In the Strong Program the source of material tended to be historical archives; in EPOR the method was interviews with members of the ‘core-set’ of scientific laboratories contributing to a scientific controversy; in laboratory studies the method was a long sojourn in, or deep study of, a single laboratory (Latour and Woolgar 1979, Knorr Cetina 1981). The setting for the laboratory studies tended to be the life sciences rather than the hard sciences or fringe sciences. Another important input was the interpretive tradition in sociology associated with phenomenology and ethnomethodology. Indeed the term ‘social construction’ is a widely used description of the approach: ‘social construction of scientific knowledge’ was taken from the title of a well-known book by Peter Berger and Thomas Luckmann (1967)—though Berger’s Invitation to Sociology (1963) probably had more direct impact on the field. Ethnomethodology strongly influenced all those who practiced SSK, especially the detailed analyses of scientific practice such as those by Lynch (1985 [but written much earlier than its publication date]) and Woolgar (1976), and the move of some authors into what became known as ‘discourse analysis’ (Knorr Cetina and Mulkay 1983). As the 1980s turned into the 1990s the concentration
on language that began with the move to discourse analysis helped SSK come to seem part of the much larger movement known as 'postmodernism.' The influence of French philosophers of literature and culture such as Derrida became marked. Since then 'cultural studies of science,' which share with SSK what could broadly be called a 'social constructivist' approach to science, have attracted large numbers of followers in the humanities as well as the social sciences. The respective methods and, more especially, the attitudes to methodology make it possible to separate cultural studies from SSK. Crucially, SSK practitioners still take natural science as the touchstone where matters of method are concerned, though their model of science is very far from the narrow, statistically based notion of science that informed the 'scientific sociology' of the 1950s and early 1960s and continues to dominate much mainstream sociological practice in the USA. SSK stresses the more general values of careful empirical observation and repeatability. Cultural studies, on the other hand, takes philosophical literary criticism, or semiotics, as the model to be followed.
2. Concepts

Among sociology's traditional topics was the study of the social factors which affected scientists' choice of research topic. The sociology of scientific knowledge, however, took it that the outcomes of research projects were also affected by their social setting. SSK showed that theoretical and experimental procedures did not determine, or at least did not fully determine, the conclusions of scientific research. The philosophical ideas described above provided an important starting point, but sociology developed its own concepts, set in the working world of the scientist rather than the abstract world of the philosopher. One strand of research showed how difficult it was to transfer the skills of physics experimentation between settings, that it was even harder to know that such skills had been transferred, and that it was harder still to know when an experiment had been satisfactorily completed. This meant that one could not know whether a negative experiment should be taken to contradict a positive one. This argument became known as 'the experimenter's regress' (Collins 1975, 1985), and revealed the scope for 'interpretive flexibility' regarding the outcomes of passages of experimentation and theorization. It also means that even while SSK stresses the importance of empirical research and repeatability, it is aware that, by themselves, such procedures cannot turn disputed knowledge into certainty in either the natural or the social sciences. Even when experimental science produced widely accepted data, as Pinch (1981) showed, their 'evidential significance' could vary, and the same finding could count as revolutionary or insignificant. Sociological studies also described the process between the
disordered activities of the laboratory and the orderly 'findings' (Latour and Woolgar 1979, Knorr Cetina 1981) or 'discoveries' (Brannigan 1981) reported—or constructed—in the published literature. Latour and Woolgar demonstrated the way that 'modalities'—phrases that qualified a finding or referred to the particular time and place of its generation—were successively stripped from publications as the finding became established. What they called 'inversion' was the way the stripping of the modalities, and like processes, comprised the establishment of the finding—a reversal of the accepted direction of the causal arrow. It was also shown that, with some exceptions, 'distance lends enchantment,' that is, certainty about scientific outcomes tends to increase as the interpreter's distance from the laboratory increases (Collins 1985, MacKenzie 1998). This explained why commentators' accounts often seemed far more confident than the reports of the scientists themselves, and explained much about the way scientific knowledge was diffused beyond the specialists. Mechanisms were described by which potentially open-ended disputes were brought to a close around a particular interpretation, with EPOR concentrating on fringe sciences, the 'French School' concentrating on the interplay of actors within networks (Latour 1987), and the 'Edinburgh School' concentrating on large-scale political influences on the content of ideas (Shapin 1979, MacKenzie 1981, Shapin and Schaffer 1985). More recently, work on closure has broadened to include studies in the social construction of technology, the public understanding of science, and law and science.
3. Significance

As the recent so-called 'science wars' have revealed, outsiders consider that the importance of SSK and related analyses of science lies in the impact they have on the perception of the meaning of science in the wider world. SSK is widely seen as a radical relativist attack on science. But both the philosophical and the political radicalism of the social analysis of science vary from program to program and study to study. For example, 'epistemological relativism' implies that one social group's way of justifying its knowledge is as good as another's and that there is no external vantage point from which to judge between them; all that can be known can be known only from the point of view of one social group or another. Ontological relativism is the view that, for social groups such as those described above, reality itself is different. A combination of epistemological and/or ontological relativism can be referred to as 'philosophical relativism.' This attitude is nicely captured by McHugh in the following quotation: 'We must accept that there are no adequate grounds for establishing criteria of truth except the grounds that are employed to grant or concede it—truth is conceivable only as a socially
organized upshot of contingent courses of linguistic, conceptual and social behaviour' (1971, p. 329). Philosophical relativism is, then, a philosophically radical viewpoint. A still more philosophically radical position is the 'actor (or actant) network theory' (ANT) most closely associated with Michel Callon and Bruno Latour. The approach was first signaled by Latour and Woolgar when, in 1986, they changed the subtitle of Laboratory Life (Latour and Woolgar 1979) from The Social Construction of Scientific Facts to The Construction of Scientific Facts so as to signify that the 'social,' in their view, no longer deserved special mention in the shaping of scientific knowledge. Subsequently, as Callon and Latour developed their ideas, scientific facts came to be seen as emerging from the interplay of 'actants' in the 'text' of life—suggesting that ANT should most properly be included under cultural studies of science rather than SSK. Within Callon and Latour's 'text,' terms such as 'social' and 'natural' are themselves outcomes of the interplay of actants rather than original causes. Therefore, to draw attention to the social as a special factor is to begin the discussion at too low a level of generality. Under this approach 'constructivism' has ceased to be social, and among the actants nonhumans are given as much reality-forming agency as humans (Callon 1986). The philosophical radicalness of ANT is clear in its refusal to accept even the notions of human and nonhuman as primary. Methodological relativism, by contrast, says nothing direct about reality or the justification of knowledge. Methodological relativism is an attitude of mind recommended by some of those who practice SSK; it says that the sociologist or historian of science should act as though the beliefs about reality of any competing groups being investigated are not caused by the reality itself. The intention is to limit analysis to the kinds of causes of scientific beliefs that are located in the domain of the social. Methodological relativism is meant to push exploration of the social causes of belief to the limit without having it cut off by the argument that a belief was widely accepted because it was rational. Methodological relativism is, then, not at all radical as a philosophy; it is a (social) scientific method. Relativism may be called time-bounded if it claims only that scientific procedures are often inconclusive in the short term, whatever the deeper significance of the consensuses that are attained in the long term. This kind of relativism, which has no philosophical potency, is all that is needed to support much current social analysis of contemporary science, especially of controversies that impact on the public domain (e.g., Collins and Pinch 1993, Richards 1991, Epstein 1996). The relationship between philosophical radicalism and political radicalism is, however, perverse. Time-bounded relativism has consequences for contemporary scientific controversies such as engage
environmentalists and the like: it has this power because it shows that, contrary to the way these things are treated in textbooks, the 'short-term' period during which scientific disputes are being resolved is often many decades long. The empirical and analytic tools of SSK explain why this is likely to be so, and it means that we should not expect speedy science-based resolutions to our current technological dilemmas. From this it can be argued that the decision-making power once monopolized by scientific experts should be more widely shared (Wynne 1987). Methodological relativism, which is philosophically quietist, tends to be disturbing to the scientific community because it blatantly ignores scientific consensus, however well established. To go to the other extreme, ANT, the most philosophically radical position of all, by removing humans from the central position they hold in social constructivism, recreates a relationship between scientists and their nonhuman objects of study similar to that which held before the revolutions of the 1960s and 1970s (Collins and Yearley 1992, Callon and Latour 1992, Fuller 2000). What about philosophical relativism? Scientists and philosophers seem to believe that philosophical relativism can be used to justify nonscientific beliefs such as astrology or creationism, though its proponents fervently deny this, and at least one critic (Fuller 2000) has argued that it is politically quietist. SSK, of course, argues that as knowledge moves from its seat of creation to a wider audience, it tends to be stripped down and simplified, so it is not surprising that the subtleties of the various relativist positions have escaped the more outspoken critics of the social analysis of scientific knowledge.
Bibliography

Barnes B S 1974 Scientific Knowledge and Sociological Theory. Routledge and Kegan Paul, London
Barnes B S, Bloor D, Henry J 1996 Scientific Knowledge: A Sociological Analysis. Athlone Press, London
Berger P L 1963 Invitation to Sociology. Anchor Books, Garden City, NY
Berger P L, Luckmann T 1967 The Social Construction of Reality. Allen Lane, London
Bloor D 1973 Wittgenstein and Mannheim on the sociology of mathematics. Studies in the History and Philosophy of Science 4: 173–91
Bloor D 1983 Wittgenstein: A Social Theory of Knowledge. Macmillan, London
Brannigan A 1981 The Social Basis of Scientific Discoveries. Cambridge University Press, New York
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge? Routledge & Kegan Paul, London, pp. 196–233
Callon M, Latour B 1992 Don't throw the baby out with the bath school! In: Pickering A (ed.) Science as Practice and Culture. University of Chicago Press, Chicago, pp. 343–68
Collins H M 1975 The seven sexes: A study in the sociology of a phenomenon, or the replication of experiments in physics. Sociology 9(2): 205–24
Collins H M 1985 Changing Order: Replication and Induction in Scientific Practice. Sage, Beverly Hills & London [2nd edn., University of Chicago Press, 1992]
Collins H M, Pinch T J 1993 The Golem: What Everyone Should Know About Science. Cambridge University Press, Cambridge & New York [subsequent editions, 1994, 1998]
Collins H M, Yearley S 1992 Epistemological chicken. In: Pickering A (ed.) Science as Practice and Culture. University of Chicago Press, Chicago, pp. 301–26
Epstein S 1996 Impure Science: AIDS, Activism and the Politics of Knowledge. University of California Press, Berkeley, Los Angeles & London
Feyerabend P K 1975 Against Method. New Left Books, London
Fleck L 1979 Genesis and Development of a Scientific Fact. University of Chicago Press, Chicago [first published in German in 1935]
Forman P 1971 Weimar culture, causality and quantum theory, 1918–1927: Adaptation by German physicists and mathematicians to a hostile intellectual environment. In: McCormmach R (ed.) Historical Studies in the Physical Sciences, No. 3. University of Pennsylvania Press, Philadelphia, PA, pp. 1–115
Fuller S 2000 Thomas Kuhn: A Philosophical History for our Times. University of Chicago Press, Chicago
Knorr Cetina K D 1981 The Manufacture of Knowledge. Pergamon, Oxford, UK
Knorr Cetina K D, Mulkay M (eds.) 1983 Science Observed: Perspectives on the Social Study of Science. Sage, London & Beverly Hills
Kuhn T S 1961 The function of measurement in modern physical science. Isis 52: 162–76
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lakatos I 1976 Proofs and Refutations. Cambridge University Press, Cambridge [originally published in British Journal for the Philosophy of Science 1963 XIV: 1–25, 120–39, 221–45, 296–342]
Lakatos I, Musgrave A (eds.) 1970 Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Latour B 1987 Science in Action. Open University Press, Milton Keynes, UK
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, London & Beverly Hills [2nd edn., 1986]
Losee J 1980 A Historical Introduction to the Philosophy of Science. Oxford University Press, Oxford, UK
Lynch M 1985 Art and Artifact in Laboratory Science: A Study of Shop Work and Shop Talk in a Research Laboratory. Routledge and Kegan Paul, London
McHugh P 1971 On the failure of positivism. In: Douglas J D (ed.) Understanding Everyday Life. Routledge and Kegan Paul, London, pp. 320–35
MacKenzie D 1981 Statistics in Britain 1865–1930. Edinburgh University Press, Edinburgh, UK
MacKenzie D 1998 The certainty trough. In: Williams R, Faulkner W, Fleck J (eds.) Exploring Expertise: Issues and Perspectives. Macmillan, Basingstoke, UK
Mannheim K 1936 Ideology and Utopia: An Introduction to the Sociology of Knowledge. University of Chicago Press, Chicago
Pickering A 1984 Constructing Quarks: A Sociological History of Particle Physics. Edinburgh University Press, Edinburgh, UK
Pinch T J 1977 What does a proof do if it does not prove? In: Mendelsohn E, Weingart P, Whitley R (eds.) The Social Production of Scientific Knowledge. Reidel, Dordrecht, The Netherlands
Pinch T J 1981 The sun-set: The presentation of certainty in scientific life. Social Studies of Science 11(1): 131–58
Pinch T J 1986 Confronting Nature: The Sociology of Solar-neutrino Detection. Reidel, Dordrecht, The Netherlands
Popper K R 1959 The Logic of Scientific Discovery. Harper & Row, New York
Richards E 1991 Vitamin C and Cancer: Medicine or Politics? Macmillan, Basingstoke, UK
Shapin S 1979 The politics of observation: Cerebral anatomy and social interests in the Edinburgh phrenology disputes. In: Wallis R (ed.) On the Margins of Science: The Social Construction of Rejected Knowledge. Sociological Review Monograph 27. Keele University Press, Keele, UK, pp. 139–78
Shapin S, Schaffer S 1985 Leviathan and the Air-Pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
Travis G D L 1980 On the importance of being earnest. In: Knorr K, Krohn R, Whitley R (eds.) The Social Process of Scientific Investigation: Sociology of the Sciences Yearbook, 4. Reidel, Dordrecht, The Netherlands, pp. 165–93
Travis G D L 1981 Replicating replication? Aspects of the social construction of learning in planarian worms. Social Studies of Science 11: 11–32
Wilson B (ed.) 1970 Rationality. Blackwell, Oxford, UK
Winch P G 1958 The Idea of a Social Science. Routledge and Kegan Paul, London
Wittgenstein L 1953 Philosophical Investigations. Blackwell, Oxford, UK
Wittgenstein L 1956 Remarks on the Foundations of Mathematics. Blackwell, Oxford, UK
Woolgar S 1976 Writing an intellectual history of scientific developments: The use of discovery accounts. Social Studies of Science 6: 395–422
Wynne B 1987 Risk Management and Hazardous Wastes: Implementation and the Dialectics of Credibility. Springer, Berlin
H. M. Collins
Scientific Reasoning and Discovery, Cognitive Psychology of

The cognitive psychology of scientific reasoning and discovery refers to the study of the cognitive processes that scientists use in all aspects of science. Researchers have used interviews and historical records, cognitive experiments on components of scientific thinking, computational models based on particular scientific discoveries, and investigations of scientists as they reason live, or 'in vivo,' in an effort to uncover the thinking and reasoning strategies that are important in science. In this article, six different approaches to scientific reasoning are discussed. One important point to note is that scientific thinking builds upon many different cognitive components such as induction, deduction, analogy, problem solving, priming, and
categorization, which are objects of study in their own right. Research specifically concerned with scientific thinking tends to use content domains from an established science (such as physics, biology, or chemistry), or looks at how different cognitive processes, such as concepts and deduction, are used together in areas like experiment design. Useful books on the nature of scientific thinking are Tweney et al. (1981), Giere (1992), and Klahr et al. (2000).
1. Interviews and the Historical Record

Two frequently used and related approaches to investigating scientific thinking have been interviews with scientists and the analysis of historical records and documents such as notebooks. One of the earliest accounts of scientific thinking and reasoning was the interview of Albert Einstein conducted by the Gestalt psychologist Max Wertheimer (1959). Wertheimer argued that a key strategy used by Einstein was to search for invariants. Wertheimer saw the velocity of light as a key invariant around which Einstein built his theory. Wertheimer incorporated his analysis into a Gestalt theory of thought. More recently, researchers have conducted interviews in the context of principles from cognitive science. For example, Paul Thagard (1999) has conducted many interviews with the scientists who proposed that ulcers are caused by bacteria. Thagard has pointed to the important roles of serendipity, observation, and analogy in this discovery. A related line of inquiry is the use of historical documents. Using the scientists' lab books and biographical and autobiographical materials, researchers attempt to piece together the reasoning strategies that the scientists used in making a discovery. For example, Nersessian (1992) has conducted extensive analyses of the physicist Faraday's notebooks and has argued that the key to understanding his discoveries lies in his use of mental models. By mapping out the types of mental models that Faraday used and showing how these types of models shaped the discoveries that Faraday made, Nersessian offered a detailed account of the mental processes that led to a particular discovery. Another cognitive approach using the historical record is to take a real scientific discovery, such as Monod and Jacob's Nobel Prize-winning discovery of a mechanism of genetic control, give people the same problem in a simulated scientific laboratory, and determine whether they use the same discovery strategies that the scientists used to make the discovery, such as focusing on unexpected findings (Dunbar 1993).
2. Scientific Reasoning as Problem Solving and Concept Formation

Two common approaches to scientific thinking have been to see it as a way of discovering new concepts or
as a form of problem solving. Beginning with Bruner et al.'s (1956) classic experiments, in which college students were asked to induce the rule that determines whether an item is, or is not, a member of a category, these researchers attempted to discover the types of inductive reasoning strategies used to acquire new concepts. Bruner et al. argued that much of science consists of inducing new concepts from data and that the memory loads that different strategies require will make certain types of inductive reasoning strategies more common than others. More recently, Holland et al. (1986) provided an account of the different inductive learning procedures that could be used to acquire new concepts in science. Herbert Simon (1977) argued that concept formation is itself a form of problem solving; the two approaches can thus be seen as complementary (see Klahr et al. 2000). Simon argued that scientific thinking consists of a search in a problem space, the two main spaces being a hypothesis space and an experiment space. These spaces comprise all possible hypotheses and experiments, together with the operators that can be used to get from one part of a space to the next, such as grouping common elements in sets of results (the grouping operator) to form a new hypothesis. Simon took specific scientific discoveries and mapped out the types of heuristics (or strategies), such as heuristics for designing experiments, that a scientist used in searching the experiment space. Using the notion of searching in a problem space, other researchers have analyzed the types of search heuristics that are used in all aspects of scientific thinking and have conducted experiments on the problem-solving heuristics that people use in designing experiments, formulating hypotheses, and interpreting results (Dunbar 1993, Klahr et al. 2000). These approaches specify the types of knowledge that an individual must possess and the heuristics that are used to formulate hypotheses, design experiments, and interpret data.
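Simon's picture of coordinated search can be illustrated with a minimal sketch. It is purely illustrative: the candidate rules about number triples, the stock of available experiments, and the discrimination heuristic are all invented here and are not drawn from Simon's own models.

```python
# A toy "dual search": one space of candidate hypotheses about number
# triples, one space of possible experiments (triples to test), and a
# heuristic that picks the experiment on which the surviving
# hypotheses disagree most.

HYPOTHESES = {
    "even numbers":    lambda xs: all(x % 2 == 0 for x in xs),
    "increasing by 2": lambda xs: all(b - a == 2 for a, b in zip(xs, xs[1:])),
    "any increasing":  lambda xs: all(a < b for a, b in zip(xs, xs[1:])),
}
experiments = [(2, 4, 6), (1, 3, 5), (2, 4, 8), (10, 6, 2), (3, 8, 9)]
true_rule = HYPOTHESES["any increasing"]   # nature's hidden rule
candidates = dict(HYPOTHESES)              # current hypothesis space

def disagreement(triple):
    """How evenly the surviving hypotheses split over this experiment."""
    verdicts = [rule(triple) for rule in candidates.values()]
    return min(sum(verdicts), len(verdicts) - sum(verdicts))

while len(candidates) > 1 and experiments:
    # Search the experiment space: choose the most discriminating test.
    triple = max(experiments, key=disagreement)
    experiments.remove(triple)
    outcome = true_rule(triple)            # run the "experiment"
    # Search the hypothesis space: prune rules inconsistent with the outcome.
    candidates = {name: rule for name, rule in candidates.items()
                  if rule(triple) == outcome}
    print(triple, outcome, sorted(candidates))
```

The sketch captures only the alternation Simon describes, choosing an experiment from the experiment space and then pruning the hypothesis space against its outcome; the grouping operator and the richer heuristics he discusses have no analogue here.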
3. Errors in Scientific Thinking

One of the most frequently investigated aspects of scientific thinking and reasoning has been the finding that both scientists and participants in psychology experiments attempt to confirm their hypotheses when they conduct experiments, a tendency sometimes called 'confirmation bias' (see Tweney et al. 1981). Following from the writings of the philosopher Karl Popper, many psychologists have assumed that attempting to confirm a hypothesis is a faulty reasoning strategy. Numerous studies have revealed that, when given a hypothesis to test, people will design experiments that will confirm their hypothesis and will not conduct experiments that could falsify it. This is a pervasive phenomenon that is difficult to eradicate; even when given instructions to falsify hypotheses,
people find it difficult to do so. Thus, many researchers have concluded that both participants in psychology experiments and scientists at large make this faulty reasoning error. However, Klayman and Ha (1987) argued that conducting experiments that confirm a hypothesis is not necessarily a scientific reasoning error. They argued that if the prior probability of confirming one's hypothesis is low, then even if the scientist is attempting to confirm the hypothesis, it can still be disconfirmed. Another interpretation of the phenomenon of confirmation bias is that, early in developing a theory or hypothesis, people will attempt to confirm it; once the hypothesis is fleshed out and confirmed, however, people will attempt to conduct disconfirming experiments (see Tweney et al. 1981).
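Klayman and Ha's argument can be made concrete with a small simulation, sketched below under invented assumptions: the tester's hypothesis ('multiples of 5') is broader than nature's rule ('multiples of 10'), so even a pure positive test strategy, which proposes only items the hypothesis predicts should follow the rule, frequently yields disconfirming feedback.

```python
import random

def hypothesis(n):          # the tester's hypothesis (invented for the sketch)
    return n % 5 == 0       # "the rule is: multiples of 5"

def true_rule(n):           # nature's actual rule (also invented)
    return n % 10 == 0      # "the rule is: multiples of 10"

random.seed(0)
# Positive test strategy: propose only items the hypothesis says follow the rule.
positive_tests = random.sample([n for n in range(1, 200) if hypothesis(n)], 8)

for n in positive_tests:
    feedback = true_rule(n)               # nature answers yes or no
    verdict = "confirms" if feedback else "DISCONFIRMS"
    print(f"test {n:3d}: nature says {feedback} -> {verdict} the hypothesis")
```

Roughly half of these 'confirmatory' tests come back negative; whether positive testing is diagnostic thus depends on how the hypothesis overlaps the true rule, which is Klayman and Ha's point.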
4. Science ‘In io’: How Scientists Think in Naturalistic Contexts One important issue in scientific reasoning and discovery is that most accounts have tended to use indirect evidence such as lab notebooks, biographies, and interviews with scientists to determine the thinking and reasoning strategies that scientists use. Another approach is to conduct experiments on isolated aspects of scientific thinking. See Dunbar (1995) for an analysis of these standard approaches that he has termed ‘in itro.’ Both approaches, while very informative, do not look at scientists directly. Thus, a complimentary approach has been to investigate real scientists’ thinking and reasoning strategies while they are conducting real research. Using this ‘in io’ approach, Dunbar (1999) has identified the specific ways that scientists use analogies, deal with unexpected findings, and use collaborative reasoning strategies in their research. He found that scientists use analogies to similar entities (or ‘local analogies’) when fixing experimental problems, analogies to entities from the same class of items (or ‘regional analogies’) when formulating new hypotheses, and analogies to very dissimilar domains (‘long-distance analogies’) when explaining scientific issues to others. Furthermore, Dunbar found that over half the findings that scientists obtain are unexpected, and that the scientists have specific strategies for dealing with these unexpected findings: First, scientists provide methodological explanations using local analogies that suggest ways of changing their experiments to obtain the desired result. If the changes to experiments do not provide the desired results, then the scientists switch from blaming the method to formulating hypotheses; this involves the use of ‘regional analogies,’ as well as collaborative reasoning in which groups of scientists build models and theories together. Dunbar has further brought back these ‘in io’ findings into the cognitive laboratory to conduct controlled experi13747
experiments, which, taken together, have been used to build new accounts of the ways that analogy, collaborative reasoning, and causal reasoning are used in scientific thinking (Dunbar 1999).
5. The Development of Scientific Thinking Skills

Beginning with the work of Piaget, many researchers have noted that children are similar to scientists. This 'child-as-scientist' metaphor has two main strands. First, children's acquisition of new concepts and theories is said to be similar to the large conceptual changes that occur in scientific fields. Researchers investigating this view have pointed to parallels between changes in children's concepts, such as their concepts of heat and temperature, and changes in the concepts of heat and temperature in the history of physics (see Chi 1992, Carey 1992). The second strand of the child-as-scientist metaphor is that children reason in identical ways to scientists, ranging from deduction and induction to experimental design. Some researchers have argued that there is little difference between a scientist and a three-year-old; while scientists clearly have considerably more knowledge of specific domains than children, their underlying competencies are viewed as the same. Other researchers have taken this 'child-as-scientist' view even further and argued that infants are basically scientists. Yet other researchers have argued that there are fundamental differences between children and scientists and that scientific thinking skills follow a developmental progression (see Klahr et al. 2000 for an overview of this debate).
6. Cognitively Driven Computational Discovery: Twenty-first Century Scientific Discovery

One development that has occurred in many sciences is the placing of vast amounts of information in computer databases. In the year 2000 the entire human genome was mapped and now exists in databases. Similar developments have occurred in physics, where the entire universe has been mapped and put on a database. In the case of the human genome, the data consist of long sequences of nucleotides, each of which is represented by a letter, such as A for adenine. Databases consist of strings of letters such as ATGTC, with each letter representing a particular nucleotide. These strings or sequences can extend for hundreds of millions of nucleotides without interruption. Buried in these sequences are genes, and most of the genes are of unknown function. One goal of researchers is to search the database, find genes, and determine the function of the genes and how the genes interact with each other.
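The search task described above can be illustrated with a minimal sketch. Real gene finding relies on probabilistic machinery of the kind mentioned below (Markov models, for example); this sketch, with an invented sequence, merely scans for open reading frames, i.e., runs of codons beginning with the start codon ATG and ending with a stop codon.

```python
START, STOPS = "ATG", {"TAA", "TAG", "TGA"}

def open_reading_frames(seq, min_codons=3):
    """Return candidate gene stretches: codon runs from ATG to a stop codon."""
    orfs = []
    for frame in range(3):                                  # three reading frames
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None
        for j, codon in enumerate(codons):
            if codon == START and start is None:
                start = j                                   # open a candidate
            elif codon in STOPS and start is not None:
                if j - start >= min_codons:                 # discard tiny runs
                    orfs.append("".join(codons[start:j + 1]))
                start = None                                # close the candidate
    return orfs

sequence = "CCATGGCTTATCGTGAATTTTAAGGATGAAACCCGGGTTTTAGCC"     # invented example
print(open_reading_frames(sequence))
```

A scan of this kind only proposes candidates; determining what, if anything, a candidate gene does remains the discovery problem.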
cussed here, as well as many of the algorithms (neural nets, Markov models, and production systems) discovered by cognitive psychologists in the 1980s and 1990s to discover the functions of genetic sequences and the properties of certain types of matter in the universe. One interesting aspect of this development has been that rather than computer programs being expected to make an entire discovery from beginning to end, they are a tool that can be used by scientists to help make a discovery. The cognitive psychology of scientific reasoning has moved from being a description of the scientific mind to an active participant in scientific practice. See also: Discovery Learning, Cognitive Psychology of; History of Science; Informal Reasoning, Psychology of; Piaget’s Theory of Child Development; Problem Solving and Reasoning: Case-based; Problem Solving and Reasoning, Psychology of; Problem Solving: Deduction, Induction, and Analogical Reasoning; Reasoning with Mental Models; Scientific Concepts: Development in Children
Bibliography

Bruner J S, Goodnow J J, Austin G A 1956 A Study of Thinking. Wiley, New York
Carey S 1992 The origin and evolution of everyday concepts. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN, pp. 89–128
Chi M 1992 Conceptual change within and across ontological categories: Examples from learning and discovery in science. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN, pp. 129–86
Dunbar K 1993 Concept discovery in a scientific domain. Cognitive Science 17: 397–434
Dunbar K 1995 How scientists really reason: Scientific reasoning in real-world laboratories. In: Sternberg R J, Davidson J E (eds.) Mechanisms of Insight. MIT Press, Cambridge, MA, pp. 365–95
Dunbar K 1999 The scientist in vivo: How scientists think and reason in the laboratory. In: Magnani L, Nersessian N, Thagard P (eds.) Model-based Reasoning in Scientific Discovery. Kluwer Academic/Plenum Publishers, New York, pp. 89–98
Giere R N (ed.) 1992 Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN
Holland J H, Holyoak K J, Nisbett R E, Thagard P R 1986 Induction: Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA
Klahr D, Dunbar K, Fay A, Penner D, Schunn C 2000 Exploring Science: The Cognition and Development of Discovery Processes. MIT Press, Cambridge, MA
Klayman J, Ha Y W 1987 Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review 94: 211–28
Nersessian N J 1992 How do scientists think? Capturing the dynamics of conceptual change in science. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV:
Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN
Simon H A 1977 Models of Discovery: and Other Topics in the Methods of Science. D. Reidel Publishing, Dordrecht, The Netherlands
Thagard P 1999 How Scientists Explain Disease. Princeton University Press, Princeton, NJ
Tweney R D, Doherty M E, Mynatt C R (eds.) 1981 On Scientific Thinking. Columbia University Press, New York
Wertheimer M 1959 Productive Thinking. Harper and Row, New York
K. Dunbar
Scientific Revolution: History and Sociology of

1. The Term 'Scientific Revolution'

The term 'Scientific Revolution' was introduced to denote what has been considered since the nineteenth century one of the most important discontinuities in the history of European science (H F Cohen 1994, I B Cohen 1985, Lindberg and Westman 1990). It covered roughly the period between Copernicus and Newton, i.e., the period between 1500 and 1700, and led from Aristotelian natural philosophy (see Aristotle (384–322 BC)), which derived its dogmatic authority from the Church, to the establishment of the classical sciences and their institutions (see Scientific Disciplines, History of; Scientific Academies, History of). This period can be characterized as one in which a new social group, the 'engineer-scientists' (consisting of engineers, scientists, inventors, artists, and explorers), emerged and became institutionalized. This group confronted traditional natural philosophy with the challenges of practice and experience, but also engaged in self-contained explanations of natural phenomena, in the expectation that science would become a means to master nature, as Francis Bacon put it. Among the lasting achievements of the Scientific Revolution were the establishment of heliocentric astronomy and classical mechanics, as well as numerous contributions to optics, chemistry, physiology, and other areas of modern science (for an overview see Butterfield 1965, Dijksterhuis 1986, Hall 1954). Many of these achievements became, in fact, the basis for technological breakthroughs. However, these breakthroughs occurred, as a rule, much later than anticipated by the protagonists of the Scientific Revolution. Further, the intellectual breakthroughs responsible for the revolution's lasting impact on the development of science were mainly attained towards its end, and only a generation or more after their proclamation by the early pioneers (Damerow et al. 1992).
The Scientific Revolution has become a paradigmatic reference for all approaches to the history and philosophy of science (see History of Science). At least since the Enlightenment it has been conceived as the triumph of the scientific method over the irrationalism of religious beliefs (see Science and Religion). Opposition to this view, together with a growing professionalization of historical studies, has opened up the space for other accounts, including attempts—such as that by Pierre Duhem—to deny that the Scientific Revolution actually represented a radical break, claiming instead that it merely constituted an episode within a continuous accumulation of scientific knowledge since the Middle Ages (Duhem 1996). Furthermore, the traditional emphasis on the role of outstanding protagonists of the Scientific Revolution, such as Bacon, Galileo, and Descartes, and their individual 'discoveries' (see e.g., Koyré 1978) has, in recent scholarship, increasingly receded in favor of an analysis of contexts, tracing scientific achievements back to cultural, social, and economic conditions (Osler 2000, Porter and Teich 1992, Shapin 1996). The Scientific Revolution, on the other hand, itself provided a model for historical explanations when Thomas Kuhn (see Kuhn, Thomas S (1922–96)) radically challenged traditional historiography with his notion of 'scientific revolution,' conceived of as a general structure of knowledge development (Kuhn 1962).
2. Historical and Sociological Context

From a sociological perspective, the Scientific Revolution appears as part of a wider social process in which technical knowledge assumed a new role in the organization of European societies (see Renaissance; Enlightenment). This process took off in the late Middle Ages and was primarily rooted in the larger cities, which saw an ever more diversified and developed artisanal culture and a growing accumulation of merchandise capital. The cities of early modern Europe thus offered favorable conditions for the rapid growth of technical knowledge and the reflection of this growth in political, philosophical and religious thinking. Large-scale ventures involving technical expertise, such as projects of military architecture, water regulation, military adventures, or seafaring expeditions (see History of Technology), involved types of resources, social mobility, and an outlook on the world available only in urban centers such as Florence, Venice, Paris and London, which in fact became, long after they had attained an outstanding economic role, also the nuclei of the Scientific Revolution. Historians such as Edgar Zilsel (Zilsel 2000) have considered the early modern engineering projects as a decisive condition for the practical orientation and empirical knowledge base that distinguish the science of this period from its medieval antecedents.
Beginning in the fifteenth century, ambitious practical ventures (such as the building of the cupola of Florence Cathedral, the search for a sea route to India, or the development of a new military technology) increasingly relied on expert knowledge comprising both logistic and technological competencies exceeding those of traditional practitioners and artisans. Such competencies could only be gained on the basis of broader reflection on the relevant practical and theoretical knowledge resources available. This reflection became the specialty of the new group of engineer-scientists such as Filippo Brunelleschi, Christopher Columbus, Leonardo da Vinci, Niccolò Tartaglia, and Galileo Galilei. While the states of the time (if not actually at war) competed with each other in the pursuit of practical ventures, for example, building projects and seafaring expeditions, the knowledge acquired in such ventures was nevertheless constantly spread among the engineer-scientists employed by them. In the social fabric of the early modern period, engineer-scientists occupied a place similar to that earlier conquered by Renaissance humanists, administrators and artists (see Art History). These practically oriented intellectuals were, as a rule, highly mobile, offering their services to whatever patronage was available. At the same time, they constituted, as a social group, a collective memory, accumulating and transmitting the new knowledge long before appropriate institutions of learning emerged and in spite of the frequent political and military turnovers of the time. The characteristic features of the engineer-scientists and their work become understandable against the background of their uncertain social status and their dependence on the patronage of courts and city governments (see e.g., Biagioli 1993) with rapidly changing power structures. Examples are their incessant engagement with projects for potential future patrons; their usually unrealistic promises regarding the practical benefits of their theories, inventions or projects; the secretiveness with which they treated their discoveries; their frequent involvement in priority struggles; as well as the striving to ennoble their practical knowledge by the claim of creating 'new sciences.' The social and political ambitions of the engineer-scientists were reflected in their pursuit of a literary culture of technical knowledge, largely emulating the humanist culture of the courts, including their reference to ancient Greek and Roman canons. Their contribution to the literary culture was in turn welcomed by those interested in challenging the transcendent, religious legitimization of the feudal order as an argument for the possibility of an immanent explanation of both the natural and the social world. This affinity, together with the fact that an all-encompassing explanation of the world on the basis of Aristotelian philosophy had been adopted as the official doctrine of the Catholic Church, brought the
engineer-scientists almost unavoidably into conflict with its power structures (Feldhay 1995). Other developments contributed to turning the growth of technological and scientific knowledge into a force driving profound changes in the established structures of European society. The invention of printing offered a revolutionary new means of dissemination that challenged the exclusiveness of a literary culture based on manuscripts. The new dissemination technique contributed to overcoming the traditional separation between various branches of practical knowledge, confined by a transmission process relying exclusively on participation and oral communication as well as restricted by guild regulations. It also bridged the social gulf between such practical knowledge and the theoretical knowledge transmitted via scholarly texts at universities, monasteries, and courts. As a result, knowledge resources of different provenance were integrated and became widely available. Together with the accumulation of new knowledge in the context of the great practical ventures of the time, this process formed part of a veritable explosion of knowledge, both in the sense of its expansion and of its spreading across traditional social barriers. In reaction to this knowledge explosion and its growing significance for the functioning of the advanced European societies, new institutions of learning such as the Accademia del Cimento (1657), the Royal Society (1660), and the Académie Royale des Sciences (1666) emerged in the middle of the seventeenth century (see Knowledge Societies; Scientific Academies, History of). Traditional institutions, such as those of the Church, on the other hand, had to accommodate to the new situation, as may be illustrated by the prominent involvement of Jesuits in the scientific culture of the time (Wallace 1991). Parallel to this process of institutionalization, science had, towards the end of the period here under consideration, gradually emancipated itself from the expectation of immediate practical benefits and could increasingly be pursued for its own sake. In summary, as a result of the Scientific Revolution, not only the production and transmission of technological knowledge but also its representation by scientific theories became an essential factor in the social, economic, and cultural development of European societies. This is the context in which scientists such as Huygens, Leibniz, Hooke, Newton, and Wallis, traditionally identified with the completion of the Scientific Revolution by the creation of classical terrestrial and celestial mechanics, achieved their celebrated results.
3. Structures and Achievements

The cognitive, social, and material structures and achievements of the Scientific Revolution have been extensively studied and discussed amid much
controversy. Recent scholarship in the context of a cultural history of science has emphasized the possibility of an integrative treatment of these various dimensions. From the point of view of the traditional history of ideas, the Scientific Revolution appears primarily as the renewal of a knowledge development going back to antiquity, as a renaissance of Greek science. In fact, the ancient tradition of mathematical science, and in particular the 'Elements' of Euclid, provided the protagonists of the Scientific Revolution with the canonical model of a mathematical theory, a model which they systematically applied to new areas such as ballistics (Tartaglia) and even ethics (Spinoza). But the ancient tradition also had in stock designs for a theory of nature not based on Aristotelian views, in particular Platonism and atomism, which they could hence exploit in their struggle against scholasticism. The ancient tradition finally offered a substantial corpus of knowledge in such domains as geometry, mathematical astronomy, and mechanics, serving as a point of departure for new scientific endeavors. The perception of their work as a renewal of antique science was typical of the self-image of the protagonists of the Scientific Revolution, who honored each other with such titles as that of a 'new Archimedes.' In short, the characterization of the Scientific Revolution as a renaissance of antique science by historians of ideas such as Koyré is in agreement with the claims of contemporary scientists to have accomplished a radical break with Aristotelian scholasticism (Koyré 1978). This characterization, however, is in apparent conflict with the results of studies inaugurated by scholars in the tradition of Duhem, pointing to a conceptual continuity between early modern science and its medieval predecessors (and even with contemporary scholasticism), in spite of the anti-Aristotelian polemics (Duhem 1996). Such studies have provided evidence that the intellectual means available to the engineer-scientists engaged in creating new sciences were essentially still rooted in traditional conceptual frameworks. For instance, Galileo Galilei and his contemporaries allowed their investigations of motion and mechanics to be shaped by such Aristotelian notions as the distinction between natural and violent motion and the assumption that violent motion is caused by a moving force (Damerow et al. 1992). Studies of medieval natural philosophy have revealed, on the other hand, that the advanced explanations of phenomena such as projectile motion, by which early modern scientists distinguished themselves from Aristotelian natural philosophy, make use of concepts of causation such as that of 'impetus' (or 'impressed force') and techniques for conceptualizing change such as Oresme's doctrine of the 'latitude of forms' that had already been developed by late antique or medieval commentators on Aristotle (Clagett 1968). Such contrasting accounts of early modern science appear less incompatible if other dimensions of the
development of scientific knowledge are taken into account. For instance, epistemological considerations have suggested that one should differentiate between the claim of Renaissance engineer-scientists to have created a new science or a new scientific method, on the one hand, and the knowledge base they shared with their contemporaries, on the other. Using the terminology introduced by Elkana (1988), we can say that their ambitious claim resulted from their 'image of knowledge,' which was determined by their fragile social status. This image is to be distinguished from the shared 'body of knowledge' comprising the antique and medieval heritage, which determined what challenges they were able to master. The role of this shared knowledge for the Scientific Revolution has been analyzed not only from the point of view of the history of ideas but also from the viewpoint of its actual function in the social and material contexts of this revolution. It has thus turned out that the material culture of the Scientific Revolution decisively shaped the way in which the knowledge of antique and medieval science was taken up or newly interpreted. It has become evident, for instance, that the knowledge resources available to the engineer-scientists, largely structured by traditional conceptual frameworks, were challenged by their application to the new objects of a rapidly expanding range of experience, acquired in the context of the practical ventures in which the engineer-scientists were engaged. While investigations of 'challenging objects' such as the pendulum, the trajectories of artillery, purified metals, the dissected human body, the terrestrial globe, and the planetary system often remained less successful than their protagonists hoped and claimed, they nevertheless triggered elaborations and modifications of these traditional frameworks, creating the foundations for the accomplishments of the age of classical science. Columbus searched for the sea route to India, but discovered America. Kepler tried in the Pythagorean tradition to unravel the harmonies of the world, but from the harmonies he found only three laws of planetary motion, which became the starting point of Newtonian cosmology. Galileo explained ballistics and pendulum motion on the basis of the impetus concept, but in fact he contributed to a new mechanics, which expelled this concept once and for all from science. Harvey defended Aristotle's claim of the primacy of the heart among the organs, but became the founder of an anti-Aristotelian theory, a mechanistic medicine. All these achievements resulted from coping with challenging new objects while relying on essentially traditional intellectual means. Classical science, often considered an accomplishment of the Scientific Revolution, was actually established for the most part only in the course of the eighteenth century. Even classical mechanics, the pilot and model science of the Scientific Revolution, assumed the formulation familiar from today's physics textbooks only in the aftermath of Newton's
pioneering work. The characteristics of classical science (relatively stable theoretical frameworks and generally accepted standards for the production of knowledge serving as the canonical reference for a scientific community) were, in any case, not yet shared by the scientific endeavors of the Scientific Revolution. The conceptual framework of classical science, comprising basic concepts such as inertia; a new methodical canon, including the experimental method and mathematical techniques such as the differential calculus; new images of knowledge, such as the mechanization of the world view; and the new institutions of classical science, such as the academies, were nevertheless consequences of the Scientific Revolution, as a belated result of reflection on its accumulated experience.

See also: Evolution: Diffusion of Innovations; History of Science; History of Technology; Innovation, Theory of; Kuhn, Thomas S (1922–96); Physical Sciences: History and Sociology; Technological Innovation
Bibliography

Biagioli M 1993 Galileo, Courtier: The Practice of Science in the Culture of Absolutism. University of Chicago Press, Chicago
Butterfield H 1965 The Origins of Modern Science 1300–1800. The Free Press, New York
Clagett M H (ed.) 1968 Nicole Oresme and the Medieval Geometry of Qualities and Motions. University of Wisconsin Press, Madison, WI
Cohen H F 1994 The Scientific Revolution: A Historiographical Inquiry. University of Chicago Press, Chicago
Cohen I B 1985 Revolution in Science. Belknap Press, Cambridge, MA
Damerow P, Freudenthal G, McLaughlin P, Renn J 1992 Exploring the Limits of Preclassical Mechanics. Springer, New York
Dijksterhuis E J 1986 The Mechanization of the World Picture: Pythagoras to Newton. Princeton University Press, Princeton, NJ
Duhem P 1996 Essays in the History and Philosophy of Science. Hackett, Indianapolis, IN
Elkana Y 1988 Experiment as a second order concept. Science in Context 2: 177–96
Feldhay R 1995 Galileo and the Church: Political Inquisition or Critical Dialogue? Cambridge University Press, Cambridge, UK
Hall A R 1954 The Scientific Revolution 1500–1800: The Formation of the Modern Scientific Attitude. Longmans, Green, London
Koyré A 1978 Galileo Studies. Humanities Press, Atlantic Highlands, NJ
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lindberg D C, Westman R S (eds.) 1990 Reappraisals of the Scientific Revolution. Cambridge University Press, Cambridge, UK
Osler M J (ed.) 2000 Rethinking the Scientific Revolution. Cambridge University Press, Cambridge, UK
Porter R, Teich M (eds.) 1992 The Scientific Revolution in National Context. Cambridge University Press, Cambridge, UK
Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago
Wallace W A 1991 Galileo, the Jesuits, and the Medieval Aristotle. Variorum, Aldershot, UK
Zilsel E 2000 The Social Origins of Modern Science. Kluwer, Dordrecht, The Netherlands
P. Damerow and J. Renn
Scientometrics

1. Introduction

Scientometrics can be defined as the study of the quantitative aspects of scientific communication, R&D practices, and science and technology (S&T) policies. The objective is to develop indicators of the intellectual and social organization of the sciences using network relations between scientific authors and texts. The specialty has developed in relation to the increased capacities of computer storage and information retrieval of scientific communications (e.g., citation analysis). Archival records of scientific communications contain institutional address information, substantive messages (e.g., title words), and relational information from which one is able to reconstruct patterns and identify the latent characteristics of both authors and document sets. Using scientometric techniques, one is thus able to relate institutional characteristics at the level of research groups to developments at the level of scientific disciplines and specialties. Citations, for example, can be used for retrieving documents on the basis of author names, or vice versa. The scientometric representation is formal: it remains in need of an interpretation. The focus on the uncertainty contained in the distributions relates scientometrics additionally to the (neo-evolutionary) study of complex and adaptive systems. Simulation models are increasingly used for the study of the role of science-based technologies in innovation processes. However, the specialty remains data driven because of its mission to provide indicators to S&T policy processes and R&D management.
2. A Metric of Science? In 1978, the journal Scientometrics was launched as a new medium to stimulate the quantitative study of scientific communication. Derek de Solla Price, one of the founding fathers of the specialty, proclaimed in his introduction the development of scientometrics as the
emergence of a 'relatively hard' social science. This claim has generated discussion from the very beginning of the specialty. In that same year (1978), leading sociologists of science published an edited volume entitled Toward a Metric of Science: The Advent of Science Indicators, dedicated to 'Paul F. Lazarsfeld (1901–76), Master of Quantitative and Qualitative Social Research' (Elkana et al. 1978). The systematic comparison of science indicators across fields of science was made possible by the creation of the Science Citation Index by Eugene Garfield of the Institute for Scientific Information (Garfield 1979). A preliminary version of this index became available in 1962. The creation of the database has stimulated the development of new perspectives on studies in various traditions. For example, the growth of scientific disciplines and specialties can be discussed in quantitative terms using this database (e.g., Price 1963), 'invisible colleges' can be explained in terms of network structures (Crane 1969), and theories of citations can perhaps be developed and tested (cf. Braun 1998). The development of the specialty went hand in hand with the need for a means of legitimating science policies (Wouters 1999). Narin (1976) elaborated an instrumentarium for the systematic development of the biennial series of Science Indicators which the National Science Foundation of the USA began providing in 1972. (In 1987, the name of this series was changed to Science and Engineering Indicators.) With the further development of S&T policies in other nation-states, and with the gradual emergence of such policies at the European level, scientometrics became a booming business during the 1980s. By comparing radio-astronomy facilities at the international level, Martin and Irvine (1983) showed the feasibility of comparing research groups in terms of quantitative performance indicators. In a special issue of Social Studies of Science about performance indicators, Collins (1985) raised the question of the unit of analysis: what is being assessed, in terms of what? The authors, the papers, or the cognitions shaped in terms of sociocognitive interactions among authors and discourses? In relation to French traditions of linguistic analysis, Callon et al. (1983) proposed using words and their co-occurrences, instead of citations and co-citations (Small 1973), as units of analysis. Citations can be considered as subtextual codifications, while words indicate the variation at the level of texts. Words may change both in meaning and in terms of their observable frequency distributions.
3. Methodologies The availability of well-organized relational databases on an annual basis challenges the scientometrician to develop comparative statistics. How is the structure in each of these years to be depicted, and how may
changes in structure be distinguished from, and related to, changes in the observable variation? Can the difference over time be identified as 'growth of science,' or is it mainly a difference between two measurement errors? How significant are differences when tested against simulation results? In principle, the idea of a dynamic mapping of science requires an independent operationalization of the structural (that is, latent) dimensions of the maps and of the observable variation which is to be pencilled into these maps. Science, however, develops not only in terms of the variation, but also by changing its structural dimensions. Because of the prevailing reflexivity within the science system, previous structures can be felt as constraints and at the same time be used as resources. The construction of structure may historically be stabilized, but reflexive actors are able to deconstruct and to assess the previous constructions with hindsight. The methodological apparatus for the mapping of science in terms of multivariate statistics (multidimensional scaling, cluster analysis, etc.) was a product of the 1980s (e.g., Van Raan 1988). The 1990s provided the evolutionary turn: what does history mean in relation to (envisaged?) future options? How can the system itself be informed reflexively with respect to its self-organizing capacities (Leydesdorff 1995)? The relation to technometrics and the measurement of 'systems of innovation' has become central to a shifting research agenda. The development of the sciences has increasingly been contextualized in relation to science-based technologies and innovation systems (Gibbons et al. 1994).

3.1 Comparative Statistics of Science

Various methods for the mapping of science can be based on relational indicators such as citations, words, co-occurrences of each of these categories, etc. Clustering, however, requires the choice of a similarity criterion and a clustering algorithm. The distinction between positional (or factor-analytical) and relational (or graph-analytical) approaches is also relevant: the network is expected to contain an architecture in which actors have a position. A nearby position does not necessarily imply a relation. The methodological reflection may thus help to clarify the theoretical analysis (Burt 1982). The mapping requires a specific perspective for the projection. Each perspective assumes a position or a window. If the multidimensional space is highly structured, the different positions may provide nearly incommensurable projections. In an article on the development of the relation between scientometrics and other subfields of S&T studies, Leydesdorff and Van den Besselaar (1997) showed that the mappings depicting science studies from the perspective of the journals Scientometrics, Social Studies of Science, and Research Policy, respectively, are increasingly different
during the 1990s. Citation relations among these core journals tend to decrease. The authors characterize Social Studies of Science as the 'codifier' of the field (along the historical axis), Scientometrics as the 'formalizer,' and Research Policy as the 'utilizer.' Only a few scholars in 'Science, Technology, and Innovation Studies' have developed competences for communicating across these subdisciplinary boundaries.

3.2 Time-series Methodologies

During the 1980s a debate raged in the community concerning the scientometric indication of a 'decline of British science.' Eventually, a special issue of Scientometrics was devoted to the debate in 1991 (Martin 1991). Some agreement could be reached that the inclusion and exclusion of data types and the framework for the comparison can be crucial for the dynamic evaluation. Should one compare with reference to a previous stage (for example, in terms of an ex ante fixed journal set), or should one reconstruct the relevant data with hindsight? For example, not only has the number of biotechnology journals changed, but our understanding of 'biotechnology' is itself changing continuously. The sociological understanding of scientific knowledge production and control seems to have eroded Price's dream of developing scientometrics as a relatively 'hard' social science. Using time-series analysis, one is always able to increase the fit of a curve by allowing for higher-order polynomials. Here again, the theoretical appreciation has to guide the choice of the parameters. For example, if one wishes to measure growth, it may be useful to include a second- or third-order polynomial in addition to the linear fit, as in the sketch below.
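The point about fit and polynomial order can be made concrete with a small simulation. The sketch below uses hypothetical publication counts (the data and parameter values are invented purely for illustration) to show that residual error can only fall as the polynomial degree rises, which is why the choice of degree must be justified theoretically rather than by fit alone:

```python
import numpy as np

# Hypothetical annual publication counts for a growing specialty
# (simulated data; for illustration only).
rng = np.random.default_rng(42)
t = np.arange(20, dtype=float)                  # years since the start
counts = 100 + 12 * t + 0.8 * t**2 + rng.normal(0, 25, t.size)

for degree in (1, 2, 3):
    fitted = np.polyval(np.polyfit(t, counts, degree), t)
    rss = np.sum((counts - fitted) ** 2)
    print(f"degree {degree}: residual sum of squares = {rss:.0f}")
# The residual sum of squares never increases with the degree, so the
# data alone cannot decide between a linear and a higher-order growth model.
```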
Figure 1, for example, depicts the emergence of the modern citation itself as a historical phenomenon, but using theoretically informed helplines. Scientific citation emerged around the turn of the twentieth century as a means to search in both the textual and the social dimensions of science. Thus, the citation can be considered as an indicator of the complexity of sociocognitive interaction in science after its institutionalization in the nineteenth century.

Figure 1 The emergence of the modern citation in the February issues of the Journal of the American Chemical Society (1890–1910) (after Leydesdorff and Wouters 1999, p. 175).

3.3 Neo-evolutionary Methodologies
The networks of texts and the networks of authors operate upon each other in a selective mode. The distributions are expected to contain information, and this information may be increasingly codified by recurrent selections. As the systems 'lock in' (in terms of their mutual information), closure of the communication into a paradigm is one among various possibilities. The uncertainties that prevail in these networks can interactively generate codifications, which can be expected to exhibit a 'life-cycle.' Both participants and observers are able to hypothesize these structures reflexively, and new information can be expected to induce an update. Thus, codification drives the knowledge production process; it also provides instruments for local control of otherwise global developments. The relational operation is recursive: for example, citations refer to other texts and/or to citations in other texts. The networks resulting from this operation are expected to have an architecture (which can be mapped at each moment in time). Operations are expected to be reproduced if they further the production of new knowledge and the latter's retention in cognitive structures. What is functional, however, is decided at a next moment in time, that is, at the level of a reflexive hypernetwork that overlays the historically generated ones. The distributions indicate the patterns of the expected operations. Thus, the process of advancing new knowledge claims is propelled and made more precise and selective through trade-offs of references in the social and cognitive dimensions. Scientometric studies can be helpful in revealing patterns of intellectual and social organization that may have remained (partially) latent to the knowledgeable actors involved. Simulation studies using scientometric mappings as input enable us to indicate the difference that the moves of the players can make. The complexity of the scientists' worlds is reflected in the scientometric reconstructions. The recognition of these objectified reconstructions recursively assumes, and potentially refines, the cognition within the discourses at both levels. Over time, the cognitive reconstruction becomes thoroughly selective: citations may be 'obliterated by incorporation' into the body of knowledge, and social factors may play a role in further selections, e.g., in
terms of reputations. In this co-evolution between communications and authors, distributions of citations function, among other things, as contested boundaries between specialties. Since the indicators are distributed, the boundaries remain to be validated. Functions are expected to change when the research front moves on. By using references, authors position their knowledge claims within one specialty area or another. Some selections are chosen for stabilization, for example when codification into citation classics occurs. Some stabilizations are selected for globalization at a next-order level, for example when the knowledge component is integrated into a technology.
4. Conclusion

The focus on evolutionary dynamics relates scientometrics increasingly to the further development of evolutionary economics (Leydesdorff and Van den Besselaar 1994). How can systems of innovation be delineated? How can the complex dynamics of such systems be understood? How is the (potentially random) variation guided by previously codified expectations? How can explorative variation be increased in otherwise 'locked-in' trajectories of technological regimes or paradigms? From this perspective, the indication of newness may become more important than the indication of codification. The Internet, of course, offers a research tool for what has now also been called 'sitations' (Rousseau 1997). 'Webometrics' may develop as a further extension of scientometrics, relating the field to other subspecialties of science and technology studies, such as the public understanding of science or the appropriation of technology and innovation using patent statistics.
Bibliography

Braun T (ed.) 1998 Topical discussion issue on theories of citation. Scientometrics 43: 3–148
Burt R S 1982 Toward a Structural Theory of Action. Academic Press, New York
Callon M, Courtial J-P, Turner W A, Bauin S 1983 From translations to problematic networks: An introduction to co-word analysis. Social Science Information 22: 191–235
Collins H M 1985 The possibilities of science policy. Social Studies of Science 15: 554–8
Crane D 1969 Social structure in a group of scientists. American Sociological Review 34: 335–52
Elkana Y, Lederberg J, Merton R K, Thackray A, Zuckerman H 1978 Toward a Metric of Science: The Advent of Science Indicators. Wiley, New York
Garfield E 1979 Citation Indexing. Wiley, New York
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London
Leydesdorff L 1995 The Challenge of Scientometrics: The Development, Measurement, and Self-organization of Scientific Communications. DSWO Press, Leiden University, Leiden, Netherlands
Leydesdorff L, Van den Besselaar P (eds.) 1994 Evolutionary Economics and Chaos Theory: New Directions in Technology Studies. Pinter, London
Leydesdorff L, Van den Besselaar P 1997 Scientometrics and communication theory: Towards theoretically informed indicators. Scientometrics 38: 155–74
Leydesdorff L, Wouters P 1999 Between texts and contexts: Advances in theories of citation? Scientometrics 44: 169–82
Martin B R 1991 The bibliometric assessment of UK scientific performance: A reply to Braun, Glänzel and Schubert. Scientometrics 20: 333–57
Martin B R, Irvine J 1983 Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy 12: 61–90
Narin F 1976 Evaluative Bibliometrics. Computer Horizons, Cherry Hill, NJ
Price D J de Solla 1963 Little Science, Big Science. Columbia University Press, New York
Rousseau R 1997 Sitations: An exploratory study. Cybermetrics 1: 1. http://www.cindoc.csic.es/cybermetrics/articles/v1i1p1.html
Small H 1973 Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science 24: 265–9
Van Raan A F J (ed.) 1988 Handbook of Quantitative Studies of Science and Technology. North-Holland, Amsterdam
Wouters P 1999 The Citation Culture. PhD thesis, University of Amsterdam
See also: Communication: Electronic Networks and Publications; History of Science; Libraries; Science and Technology, Social Study of: Computers and Information Technology; Science and Technology Studies: Experts and Expertise
L. Leydesdorff
Screening and Selection
1. Introduction

In order to assist in selecting individuals possessing either desirable traits, such as an aptitude for higher education or skills needed for a job, or undesirable ones, such as an infection, illness, or a propensity to lie, screening tests are used as a first step. A more rigorous selection device, e.g., a diagnostic test or detailed interview, is then used for the final classification.
For some purposes, such as screening blood donations for a rare infection, the units classified as positive are simply not used, while further tests on the donors may not be given, as most will not be infected. Similarly, for estimating the prevalence of a trait in a population, the screening data may suffice, provided an appropriate estimator incorporating the error rates of the test is used (Hilden 1979, Gastwirth 1987, Rahme and Joseph 1998). This article describes the measures of accuracy used to evaluate and compare screening tests and the issues arising in the interpretation of the results. The importance of the prevalence of the trait in the population screened, and of the relative costs of the two types of misclassification, is discussed. Methods for estimating the accuracy rates of screening tests are briefly described, and the need to incorporate them in estimates of prevalence is illustrated.
2. Basic Concepts

The purpose of a screening test is to determine whether a person or object is a member of a particular class C or its complement $\bar{C}$. A test result indicating that the person is in C will be denoted by S, and a result indicating non-membership by $\bar{S}$. The accuracy of a test is described by two probabilities: the sensitivity $\eta = P[S \mid C]$, the probability that someone in C is correctly classified; and the specificity $\theta = P[\bar{S} \mid \bar{C}]$, the probability that someone not in C is correctly classified. Given the prevalence $\pi = P(C)$ of the trait in the population screened, it follows from Bayes' theorem that the predictive value of a positive test (PVP) is

$$ P[C \mid S] = \frac{\pi\eta}{\pi\eta + (1-\pi)(1-\theta)} \qquad (1) $$

Similarly, the predictive value of a negative test is

$$ P[\bar{C} \mid \bar{S}] = \frac{(1-\pi)\theta}{(1-\pi)\theta + \pi(1-\eta)} $$

In the first two sections we will assume that the accuracy rates and the prevalence are known. When they are estimated from data, appropriate sampling errors for them and for the PVP are given in Gastwirth (1987). For illustration, consider an early test for HIV, having a sensitivity of 0.98 and a specificity of 0.93, applied in two populations. The first has a very low prevalence of the infection, $1.0 \times 10^{-3}$, while the second has a prevalence of 0.25.
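A minimal sketch of these calculations (the function is ours, written directly from Eqn. (1) and its negative-test counterpart) reproduces the figures discussed next, together with the standard error-rate correction for prevalence estimates alluded to above:

```python
def predictive_values(sens, spec, prev):
    """Predictive values of a positive and a negative test, via Eqn. (1)."""
    pvp = prev * sens / (prev * sens + (1 - prev) * (1 - spec))
    pvn = (1 - prev) * spec / ((1 - prev) * spec + prev * (1 - sens))
    return pvp, pvn

print(predictive_values(0.98, 0.93, 1.0e-3))   # PVP ~ 0.0138
print(predictive_values(0.98, 0.93, 0.25))     # PVP ~ 0.8235

# The raw positive fraction p badly overestimates a small prevalence, but
# the standard correction pi = (p + spec - 1)/(sens + spec - 1) recovers it.
p = 0.98 * 1.0e-3 + (1 - 0.93) * (1 - 1.0e-3)  # expected positive fraction, ~0.0709
print((p + 0.93 - 1) / (0.98 + 0.93 - 1))      # ~0.001, the true prevalence
```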
From Eqn. (1), the PVP in the first population equals 0.0138; i.e., only about one and one-half percent of the individuals classified as infected would actually be so, and nearly 99 percent would be false positives. Notice that if the fraction of positives in the screened data, expected to be 0.0709, were used to estimate prevalence, a severe overestimate would result; adjusting for the error rates yields an accurate estimate. In the higher-prevalence group the PVP is 0.8235, indicating that the test could be useful in identifying infected individuals. Currently used tests have accuracy rates greater than 0.99, but even these still have a PVP of less than 0.5 when applied to a low-prevalence population. A comprehensive discussion is given in Brookmeyer and Gail (1994).

The role of the prevalence, also called the base rate, of the trait in the screened population, and how well people understand its effect, has been the subject of substantial research, recently reviewed by Koehler (1996). An interesting consequence is that when steps are taken to reduce the prevalence of the trait prior to screening, the PVP declines and the fraction of false positives increases. Thus, when high-risk donors are encouraged to defer, or when background checks eliminate a sizeable fraction of unsuitable applicants before they are subjected to polygraph testing, the fraction of classified positives who are truly positive is small.

In many applications screening tests yield data that are ordinal or essentially continuous, e.g., scores on psychological tests or the concentration of HIV antibodies. Any value t can be used as a cut-off point to separate individuals with the trait from 'normal' ones. Each t generates a corresponding sensitivity and specificity for the test, and the user must then incorporate the relative costs of the two different errors and the likely prevalence of the trait in the population being screened to select the cut-off. The receiver operating characteristic (ROC) curve displays the trade-off between the sensitivity and specificity defined by various choices of t and also yields a method for comparing two or more screening tests. To define the ROC curve, assume that the distribution of the measured variable (test score or physical quantity) is $F(x)$ for the 'normal' members of the population but $G(x)$ for those with the trait. The corresponding density functions are $f(x)$ and $g(x)$, respectively, and for a good screening test $g(x)$ will be shifted relative to $f(x)$ (to the right if large scores indicate the trait, to the left if small scores do). Often one fixes the probability $\alpha = 1 - \theta$ (one minus the specificity) of classifying a person without the characteristic as having it at a small value. Then t is determined from the equation $F(t) = 1 - \alpha$, and the sensitivity of the test is $\eta = 1 - G(t)$. The ROC curve plots $\eta$ against $1 - \theta$. A perfect test would have $\eta = 1$, so the closer the ROC is to the upper left corner in Fig. 1, the better the screening test. In Fig. 1 we assume that f is a normal density with mean 0 and variance 1, while $g_1$ is normal with mean 1 and the same variance. For comparison we also graph the ROC for a second test, whose density $g_2$ has mean 2 and variance 1.
Figure 1 The ROC curves for screening tests 1 and 2. The solid line is the curve when the diseased group has mean 1, and the dashed curve is for the second group with mean 2.
Notice that the ROC curve for the second test is closer to the upper left corner (0, 1) than that of the first test. A summary measure (Campbell 1994) that is useful in comparing two screening tests is the area, A, under the ROC curve: the closer A is to its maximum value of 1.0, the better the test. In Fig. 1, the areas under the ROC curves for the two tests are 0.761 and 0.922, respectively; the areas thus reflect the fact that the ROC curve for the second test is closer to that of a 'perfect' test. This area equals the probability that a randomly chosen individual with the trait will have a higher score on the screening test than a randomly chosen normal individual. This probability is the expected value of the Mann–Whitney form of the Wilcoxon statistic for comparing two distributions, and methods for estimating it are in standard non-parametric statistics texts. Non-parametric methods for estimating the entire ROC curve are given by Wieand et al. (1989), and Hilgers (1991) obtained distribution-free confidence bounds for it. Campbell (1994) uses confidence bounds on F and G to construct a joint confidence interval for the sensitivity and one minus the specificity, in addition to proposing alternative confidence bounds for the ROC itself. Hsieh and Turnbull (1996) determine the value of t that maximizes the sum of the sensitivity and specificity; their approach can be extended to maximizing a weighted average of the two accuracy rates, as suggested by Gail and Green (1976).
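For the equal-variance binormal case of Fig. 1, the area has the standard closed form $A = \Phi(\mu/\sqrt{2})$, where $\mu$ is the mean shift. The sketch below evaluates this and checks it against the Mann–Whitney estimate from simulated scores (sample sizes are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# Closed form for the binormal tests of Fig. 1 (unit variances, shifts 1 and 2).
for mu in (1.0, 2.0):
    print(f"mean shift {mu}: A = {norm.cdf(mu / np.sqrt(2)):.3f}")  # ~0.76, ~0.92

# A is also P(score with trait > normal score), i.e. the Mann-Whitney
# probability, so it can be estimated directly from two samples.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 2000)           # 'normal' group
y = rng.normal(1.0, 1.0, 2000)           # group with the trait
print((y[:, None] > x[None, :]).mean())  # close to 0.76
```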
Wieand et al. (1989) also developed related statistics focusing on the portion of the ROC lying above a region defined by two values $\alpha_1$ and $\alpha_2$, so that the analysis can be confined to values of specificity that are practically useful. Greenhouse and Mantel (1950) determine the sample sizes needed to test whether both the specificity and the sensitivity of a test exceed pre-specified values. The area A under the ROC curve can also be estimated using parametric distributions for the densities f and g. References to this literature, and an alternative approach using smoothed histograms to estimate the densities, are given in Zou et al. (1997), who also consider estimating the partial area over the important region determined by two appropriately small values of α.

The tests used to select employees need to be reliable and valid. Reliability means that replicate values are consistent, while validity means that the test measures what it should, e.g., successful academic performance. Validity is often assessed by the correlation between the test score (X) and subsequent performance (Y). Often X and Y can be regarded as jointly normal random variables, especially as monotone transformations of the raw scores can be used in their place. If a passing score on the screening or pre-employment test is defined as $X \ge t$ and successful performance is defined as $Y \ge d$, then the sensitivity of the test is $P[X \ge t \mid Y \ge d]$, the specificity is $P[X < t \mid Y < d]$, and the prevalence of the trait is $P[Y \ge d]$ in the population of potential applicants. Hence, aptitude and related tests can be viewed within the general screening test paradigm. When the test and performance scores are scaled to have a standard bivariate normal distribution, both the sensitivity and the specificity increase with the correlation ρ. For example, suppose one desired to obtain employees in the upper half of the performance distribution and used a cut-off score t of one standard deviation above the mean on the test X. When $\rho = 0.3$, the sensitivity is 0.217 and the specificity is 0.899; if $\rho = 0.5$, the sensitivity is 0.255 and the specificity is 0.937. The use of a high cut-off score eliminates the less able applicants but also disqualifies a majority of the applicants who are in the upper half of the performance distribution. Reducing the cut-off score to one-half a standard deviation above the average raises the sensitivities to 0.394 and 0.454 for the two values of ρ but lowers the corresponding specificities to 0.777 and 0.837. This trade-off is a general phenomenon, as seen in the ROC curves.
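These bivariate normal probabilities are easy to verify numerically. The sketch below (the function name is ours; it simply applies the conditional-probability definitions above to scipy's bivariate normal CDF) reproduces the sensitivities and specificities quoted in the example:

```python
from scipy.stats import norm, multivariate_normal

def test_accuracy(rho, t=1.0, d=0.0):
    """Sensitivity and specificity when (X, Y) are standard bivariate normal
    with correlation rho, the pass mark is X >= t, and success is Y >= d."""
    bvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    f_td = bvn.cdf([t, d])                               # P[X < t, Y < d]
    sens = (1 - norm.cdf(t) - norm.cdf(d) + f_td) / (1 - norm.cdf(d))
    spec = f_td / norm.cdf(d)
    return sens, spec

for rho in (0.3, 0.5):
    print(rho, test_accuracy(rho))       # approx (0.217, 0.899) and (0.255, 0.937)
```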
3. The Importance of the Context in Interpreting the Results of Screening Tests

In medical and psychological applications, an individual who tests positive for a disease or condition on a screening test will be given a more accurate confirmatory test or an intensive interview. The cost of a 'false positive' screening result on a medical exam is
often considered very small relative to that of a 'false negative,' which could lead to the failure to give suitable treatment in a timely fashion; a false positive presumably would be identified in a subsequent, more detailed exam. Similarly, when government agencies give employees in safety- or security-sensitive jobs a polygraph test, the loss of a potentially productive employee due to a false positive was deemed much less serious than the risk of hiring an employee who might be a risk to the public or to security. One can formalize the issue by including the costs of the various errors and the prevalence, π, of the trait in the population being screened in determining the cut-off value of the screening test. The expected cost, which weights the probability of each type of error by its cost, is given by

$$ w\,\alpha\,(1-\pi) + (1-w)\,\pi\,G(t) $$

Here the relative costs of a false positive and a false negative are w and $1-w$, respectively, and, as before, $t = F^{-1}(1-\alpha)$. The choice of cut-off value, $t_o$, minimizing the expected cost satisfies

$$ \frac{g(t_o)}{f(t_o)} = \frac{w\,(1-\pi)}{(1-w)\,\pi} \qquad (2) $$
Whenever the ratio g/f of the density functions is a monotone function, Eqn. (2) yields an optimum cut-off point, $t_o$, which depends on the costs and the prevalence of the trait. Note that for any value of π, the greater the cost of a false positive, the larger the optimum value $t_o$ will be; this reflects the fact that the specificity needs to be high in order to keep the false positive rate low. Although the relative costs of the two types of error are not always easy to obtain, and the prevalence may be known only approximately, Eqn. (2) may aid in choosing the critical value. In practice, one should also assess the effect that slight changes in the costs and in the assumed prevalence have on the choice of the cut-off value. The choice of t that satisfies condition (2) may not be optimal if one desires to estimate the prevalence of the trait in a population rather than classify individuals; Yanagawa and Tokudome (1990) determine t when the objective is to minimize the relative absolute error of the estimator of prevalence based on the screening test results. The HIV/AIDS epidemic raised questions about the standard assumptions concerning the relative costs of the two types of error. A 'false positive' classification would not only mean that a well individual would worry until the results of the confirmatory test were available; it might also have social and economic consequences if friends or an employer learned of the result.
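As a small worked example, take the binormal densities of Fig. 1, $f = N(0,1)$ and $g = N(\mu,1)$, for which the likelihood ratio $g(t)/f(t) = e^{\mu t - \mu^2/2}$ is monotone, so Eqn. (2) can be solved in closed form. The cost weight w, prevalence π, and shift μ below are hypothetical values chosen for illustration:

```python
import numpy as np

# Closed-form solution of Eqn. (2) for f = N(0,1) and g = N(mu,1):
# exp(mu*t - mu**2/2) = w(1-pi)/((1-w)pi)  =>  t = ln(rhs)/mu + mu/2.
def optimal_cutoff(w, pi, mu=1.0):
    rhs = w * (1 - pi) / ((1 - w) * pi)     # right-hand side of Eqn. (2)
    return np.log(rhs) / mu + mu / 2

print(optimal_cutoff(w=0.5, pi=0.10))   # ~2.70
print(optimal_cutoff(w=0.8, pi=0.10))   # ~4.08: costlier false positives push t up
```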
Similar problems arise in screening blood donors and in studies of the association between genetic markers and serious diseases. Recall that the vast majority of donors or volunteers for genetic studies are performing a public service and are being screened to protect others or to advance knowledge. If a donation tests positive, clearly it should not be used for transfusion; but should a screened-positive donor be informed of their status? Because the prevalence of infected donors is very small, the PVP is quite low, so most of the donors who screen positive are 'false.' Thus, blood banks typically do not inform them and rely instead on approaches that encourage donors from high-risk groups to exclude themselves from the donor pool (Nusbacher et al. 1986). Similarly, in a study (Hartge et al. 1999) of the prevalence of mutations in two genes that have been linked to cancer, the participants were not notified of their results. The screening test paradigm is also useful in evaluating tests used to select employees. The utility of a test depends on the costs of administering it and the costs associated with the two types of error. Traditionally, employers focused on the costs of a false positive, hiring an employee who does not perform well, such as termination costs and the possible loss of customers. The costs of a false negative are more difficult to estimate. The civil rights law, which was designed to open job opportunities to minorities, emphasized the importance of using appropriate tests, i.e., tests that select better workers. Employers need to check whether the tests or job requirements (e.g., possession of a high school diploma) have a disparate impact on a legally protected group. When they exclude a significantly greater fraction of minority members than majority ones, the employer needs to validate the test, i.e., show that it is predictive of on-the-job performance. Arvey (1979) and Paetzold and Willborn (1994) discuss these issues.
4. Estimating the Accuracy of the Screening Tests

So far, we have assumed that the accuracy of the screening tests can be estimated on samples from two populations where the true status of the individuals is known with certainty. In practice this is often not the case, which can lead to biased estimates of the sensitivity and specificity of a screening test, as some of the individuals believed to be normal have the trait, and vice versa. If one has samples from only one population to which to apply both the screening and the confirmatory test, then one cannot estimate the accuracy rates: the data would be organized into a 2×2 table with four cells, only three of which are independent, while there are five parameters, the two accuracy rates of each of the two tests plus the prevalence of the trait in the
population. In some situations the prevalence of the trait varies among sub-populations. If one can find two such sub-populations, and if the accuracy rates of both tests are the same in both of them, then one has two 2×2 tables with six independent cells with which to estimate the six parameters, and estimation can be carried out (Hui and Walter 1980). This approach assumes that the two tests are conditionally independent given the true status of the individual; when this assumption is not satisfied, Vacek (1985) showed that the estimates of the sensitivity and specificity of the tests are biased. This topic is an active area of research, recently reviewed by Hui and Zhou (1998). A variety of latent-class models have been developed that relax the assumption of conditional independence (see Faraone and Tsuang 1994, Yang and Becker 1997, and the literature they cite).
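A rough numerical sketch of the Hui–Walter idea follows: with two conditionally independent tests applied in two sub-populations, the six observed independent cell proportions can be matched to the six unknown parameters. The moment-equation solver below is a simplified stand-in for Hui and Walter's estimator (which they develop in closed form and by maximum likelihood), and the observed proportions are hypothetical, generated here from known parameter values so the solver can be checked against the truth:

```python
import numpy as np
from scipy.optimize import fsolve

def cell_probs(eta1, th1, eta2, th2, pi):
    # Probabilities of three of the four (test 1, test 2) outcome cells in one
    # population, under conditional independence given true status.
    p11 = pi * eta1 * eta2 + (1 - pi) * (1 - th1) * (1 - th2)   # both positive
    p10 = pi * eta1 * (1 - eta2) + (1 - pi) * (1 - th1) * th2   # only test 1 positive
    p01 = pi * (1 - eta1) * eta2 + (1 - pi) * th1 * (1 - th2)   # only test 2 positive
    return [p11, p10, p01]

def equations(x, obs_a, obs_b):
    eta1, th1, eta2, th2, pi_a, pi_b = x
    model = cell_probs(eta1, th1, eta2, th2, pi_a) + cell_probs(eta1, th1, eta2, th2, pi_b)
    return np.array(model) - np.concatenate((obs_a, obs_b))

truth = (0.95, 0.90, 0.85, 0.92, 0.05, 0.30)     # hypothetical parameter values
obs_a = np.array(cell_probs(*truth[:4], truth[4]))
obs_b = np.array(cell_probs(*truth[:4], truth[5]))
guess = (0.9, 0.8, 0.8, 0.8, 0.1, 0.4)
print(fsolve(equations, guess, args=(obs_a, obs_b)))  # recovers the six parameters
```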
5. Applications and Future Concerns

Historically, screening tests were used to identify individuals with a disease or trait, e.g., as a first stage in diagnosing medical or psychological conditions or in selecting students or employees. They are increasingly used, often in conjunction with a second, confirmatory test, in prevalence surveys for public health planning. The techniques developed are often applicable, with suitable modifications, to social science surveys. Some examples of prevalence surveys illustrate their utility. Katz et al. (1995) compared two instruments for determining the presence of psychiatric disorders, in part to assess the need for psychiatric care in the community and the available services; they found that increasing the original cut-off score yielded higher specificity without a substantial loss of sensitivity. Similar studies were carried out in Holland by Hodiamont et al. (1987), who found a lower prevalence (7.5 percent) than the 16 percent estimate in York. The two studies, however, used different classification systems, illustrating that one needs to examine carefully the methodology underlying various surveys before making international comparisons. Gupta et al. (1997) used several criteria, based on the results of an EKG and an individual's medical history, to estimate the prevalence of heart disease in India. Often one has prior knowledge of the prevalence of a trait in a population, especially if one screens similar populations on a regular basis, as do employers, medical plans, or blood centers; Bayesian methods incorporate this background information and can yield more accurate estimates (see Geisser 1993, Johnson, Gastwirth and Pearson 2001). A cost-effective approach is to use an inexpensive screen at the first stage and retest the positives with a more
definitive test; Bayesian methodology for such studies was developed by Erkanli et al. (1997). The problem of misclassification arises often in questionnaire surveys. Laurikka et al. (1995) estimated the sensitivity and specificity of self-reporting of varicose veins; while both measures were greater than 0.90, the specificity was lower (0.83) for individuals with a family history than for those with negative histories. Sorenson (1998) observed that self-reports are often accepted as true and found that the potential misclassification could lead to noticeable (10 percent) errors in estimated mortality rates. The distortion that misclassification errors can create in estimates of low-prevalence traits, because of the high fraction of false positive classifications, was illustrated in the earlier discussion of screening tests for HIV/AIDS. Hemenway (1997) applies these concepts to demonstrate that surveys typically overestimate rare events, in particular the self-defense uses of guns. Thus it is essential to incorporate the accuracy rates into the prevalence estimate (Hilden 1979, Gastwirth 1987, Rahme and Joseph 1998). Sinclair and Gastwirth (1994) utilized the Hui–Walter paradigm to assess the accuracy of both the original and the re-interview (by supervisors) classifications in labor force surveys. In its evaluation, the Census Bureau assumes that the re-interview data are correct; those authors, however, found that both interviews had similar accuracy rates. In situations where one can obtain three or more classifications, all the parameters are identifiable (Walter and Irwig 1988). Gastwirth and Sinclair (1998) utilized this feature of the screening test approach to suggest an alternative design for judge–jury agreement studies in which another expert, e.g., a law professor or retired judge, assesses the evidence.
6. Conclusion

Many selection or classification problems can be viewed within the screening test paradigm. The context of the application determines the relative costs of a misclassification or erroneous identification. In criminal trials, society has decided that the cost of an erroneous conviction far outweighs the cost of an erroneous acquittal, while in testing job applicants the cost of not hiring a competent worker is not as serious. The two types of error vary with the threshold or cut-off value, and the accuracy rates corresponding to these choices are summarized by the ROC curve. There is a burgeoning literature in this area, as researchers are incorporating relevant covariates, e.g., prior health status or educational background, into the classification procedures. Recent issues of Biometrics, Multivariate Behavioral Research, Psychometrika, and Applied Psychological Measurement, as well as the medical journals cited in this article, contain a variety of
articles presenting new techniques and applications of them to the problems discussed.

See also: Selection Bias, Statistics of
Bibliography

Arvey R D 1979 Fairness in Selecting Employees. Addison-Wesley, Reading, MA
Brookmeyer R, Gail M H 1994 AIDS Epidemiology: A Quantitative Approach. Oxford University Press, New York
Campbell G 1994 Advances in statistical methodology for the evaluation of diagnostic and laboratory tests. Statistics in Medicine 13: 499–508
Erkanli A, Soyer R, Stangl D 1997 Bayesian inference in two-phase prevalence studies. Statistics in Medicine 16: 1121–33
Faraone S V, Tsuang M T 1994 Measuring diagnostic accuracy in the absence of a gold standard. American Journal of Psychiatry 151: 650–7
Gail M H, Green S B 1976 A generalization of the one-sided two-sample Kolmogorov–Smirnov statistic for evaluating diagnostic tests. Biometrics 32: 561–70
Gastwirth J L 1987 The statistical precision of medical screening procedures: Application to polygraph and AIDS antibodies test data (with discussion). Statistical Science 2: 213–38
Gastwirth J L, Sinclair M D 1998 Diagnostic test methodology in the design and analysis of judge–jury agreement studies. Jurimetrics Journal 39: 59–78
Geisser S 1993 Predictive Inference. Chapman and Hall, London
Greenhouse S W, Mantel N 1950 The evaluation of diagnostic tests. Biometrics 6: 399–412
Gupta R, Prakash H, Gupta V P, Gupta K D 1997 Prevalence and determinants of coronary heart disease in a rural population of India. Journal of Clinical Epidemiology 50: 203–9
Hartge P, Struewing J P, Wacholder S, Brody L C, Tucker M A 1999 The prevalence of common BRCA1 and BRCA2 mutations among Ashkenazi Jews. American Journal of Human Genetics 64: 963–70
Hemenway D 1997 The myth of millions of annual self-defense gun uses: A case study of survey overestimates of rare events. Chance 10: 6–10
Hilden J 1979 A further comment on 'Estimating prevalence from the results of a screening test.' American Journal of Epidemiology 109: 721–2
Hilgers R A 1991 Distribution-free confidence bounds for ROC curves. Methods of Information in Medicine 30: 96–101
Hodiamont P, Peer N, Syben N 1987 Epidemiological aspects of psychiatric disorder in a Dutch health area. Psychological Medicine 17: 495–505
Hsieh F S, Turnbull B W 1996 Nonparametric methods for evaluating diagnostic tests. Statistica Sinica 6: 47–62
Hui S L, Walter S D 1980 Estimating the error rates of diagnostic tests. Biometrics 36: 167–71
Hui S L, Zhou X H 1998 Evaluation of diagnostic tests without gold standards. Statistical Methods in Medical Research 7: 354–70
Johnson W O, Gastwirth J L, Pearson L M 2001 Screening without a gold standard: The Hui–Walter paradigm revisited. American Journal of Epidemiology 153: 921–4
Katz R, Stephen J, Shaw B F, Matthew A, Newman F, Rosenbluth M 1995 The East York health needs study: Prevalence of DSM-III-R psychiatric disorder in a sample of Canadian women. British Journal of Psychiatry 166: 100–6
Koehler J J 1996 The base rate fallacy reconsidered: Descriptive, normative and methodological challenges (with discussion). Behavioral and Brain Sciences 19: 1–53
Laurikka J, Laara E, Sisto T, Tarkka M, Auvinen O, Hakama M 1995 Misclassification in a questionnaire survey of varicose veins. Journal of Clinical Epidemiology 48: 1175–8
Nusbacher J, Chiavetta J, Naiman R, Buchner B, Scalia V, Horst R 1986 Evaluation of a confidential method of excluding blood donors exposed to human immunodeficiency virus. Transfusion 26: 539–41
Paetzold R, Willborn S 1994 Statistical Proof of Discrimination. Shepard's/McGraw-Hill, Colorado Springs, CO
Rahme E, Joseph L 1998 Estimating the prevalence of a rare disease: Adjusted maximum likelihood. The Statistician 47: 149–58
Sinclair M D, Gastwirth J L 1994 On procedures for evaluating the effectiveness of reinterview survey methods: Application to labor force data. Journal of the American Statistical Association 91: 961–9
Sorenson S B 1998 Identifying Hispanics in existing databases. Evaluation Review 22: 520–34
Vacek P M 1985 The effect of conditional dependence on the evaluation of diagnostic tests. Biometrics 41: 959–68
Walter S D, Irwig L M 1988 Estimation of test error rates, disease prevalence and relative risk from misclassified data: A review. Journal of Clinical Epidemiology 41: 923–37
Wieand S, Gail M H, James B R, James K 1989 A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76: 585–92
Yanagawa T, Tokudome S 1990 Use of screening tests to assess cancer risk and to estimate the risk of adult T-cell leukemia/lymphoma. Environmental Health Perspectives 87: 77–82
Yang I, Becker M P 1997 Latent variable modeling of diagnostic accuracy. Biometrics 52: 948–58
Zou K H, Hall W J, Shapiro D E 1997 Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests. Statistics in Medicine 16: 2143–56
J. L. Gastwirth
Search, Economics of

The economics of search studies the implications of market frictions for economic behavior and market performance. 'Frictions' in this context include anything that interferes with the smooth and instantaneous exchange of goods and services. The most commonly studied problems arise from imperfect information about the location of buyers and sellers, their prices, and the quality of the goods and services that they trade. The key implication of these frictions is that individuals are prepared to spend time and other resources on exchange; they search before buying or selling. The labor market has attracted most
theoretical and empirical interest in this area of research, because of the heterogeneities that characterize it and because of the existence of good data on flows of workers and jobs between activity and inactivity, which can be used to test the theory's propositions.
1. Historical Background

The first formal model of individual search behavior, due to Stigler (1961), was set in the context of a goods market: choosing the optimal number of sellers to canvass before buying at the lowest price. Stigler's rule, known as a fixed-sample rule, was abandoned in favor of sequential stopping rules: choosing an optimal reservation price and buying at the first store encountered that sells at or below the reservation price (see McCall 1970 for an early influential paper). The first big momentum to research in the economics of search came with the publication of Phelps et al. (1970), which showed that search theory could be used to analyze the natural rate of unemployment and the inflation-unemployment trade-off, the central research questions of macroeconomics at that time. Although interest in the inflation-unemployment trade-off has since waned, interest in search theory as a tool for analyzing transitions in the labor market and equilibrium unemployment has increased. The momentum in this direction came in the 1980s, when contributions by Diamond (1982), Mortensen (1982), and Pissarides (1985) showed that search theory could be used to construct equilibrium models of the labor market with more accurate predictions than the traditional neoclassical model (see Pissarides 2000, Mortensen and Pissarides 1999a, 1999b for reviews). The appearance of comprehensive data on job and worker flows, which can be studied with the tools of search theory, also contributed to interest in this direction (see Leonard 1987, Dunne et al. 1989, Davis et al. 1996, Blanchard and Diamond 1990). This article reviews major developments since the mid-1980s, with explicit reference to labor markets.
2. Job Search

An individual has one unit of labor to sell to firms, which create jobs. The valuation of labor takes place under the assumptions that agents have infinite horizons and discount future income flows at the constant rate r, that they know the future path of prices and wages and the stochastic processes that govern the arrival of trading partners, and that they maximize the present discounted value of expected incomes. Let $U_t$ be the expected present discounted value of a unit of labor before trade at time t (the 'value' of an unemployed worker) and $W_t$ the expected value of an employed worker. During a short time interval $\delta t$ the unemployed worker receives income $b\,\delta t$, and a job offer arrives with probability $a\,\delta t$. The frictions studied in the economics of search are summarized in the arrival process. In the absence of frictions, $a \to \infty$; with frictions and search, $a > 0$; with no search, $a = 0$. A large part of the literature is devoted to specifying the arrival process, an issue addressed in Sect. 3.1. The choice of search intensity has also been studied, by making a an increasing function of search effort, but this issue is not addressed here (see Pissarides 2000, Chap. 5). If a job offer arrives, the individual has the option of taking it, for an expected return $W_{t+\delta t}$, or not taking it and keeping instead the return $U_{t+\delta t}$. If no offer arrives, the individual's return is $U_{t+\delta t}$. Therefore, with discount rate r, $U_t$ satisfies the Bellman equation

$$ U_t = b\,\delta t + a\,\delta t\,\frac{\max(W_{t+\delta t}, U_{t+\delta t})}{1 + r\,\delta t} + (1 - a\,\delta t)\,\frac{U_{t+\delta t}}{1 + r\,\delta t} \qquad (1) $$

Rearrangement of terms yields

$$ r U_t = b + a\,\bigl(\max(W_{t+\delta t}, U_{t+\delta t}) - U_{t+\delta t}\bigr) + \frac{U_{t+\delta t} - U_t}{\delta t} \qquad (2) $$

Taking the limit of (2) as $\delta t \to 0$, and omitting time subscripts for convenience, yields

$$ rU = b + a\,\bigl(\max(W, U) - U\bigr) + \dot{U} \qquad (3) $$

where $\dot{U}$ denotes the rate of change of U. Equation (3) is a fundamental equation in the economics of search. It can be given the interpretation of an arbitrage equation for the valuation of an asset in a perfect capital market with risk-free interest rate r. This asset yields coupon payment b and, at some rate a, gives its holder the option of a discrete change in its valuation, from U to W. Optimality requires that the option be taken (and the existing valuation given up) if $W > U$. The last term, $\dot{U}$, shows capital gains or losses due to changes in the market valuation of the asset. Most of the economics of labor-market search, however, concentrates on 'steady states,' namely situations where the discount rate, transition rates, and income flows are all constant. With infinite horizons there are then stationary solutions to the valuation equations, obtained from (3) with $\dot{U} = 0$. One simple way of solving (3) is to assume that employment is an 'absorbing state,' so that a job offering wage w, once accepted, is kept for life. Then $W = w/r$, and if the individual is sampling from a known wage offer distribution $F(w)$, the stationary version of (3) satisfies

$$ rU = b + a\left(\int \max(w/r, U)\, dF(w) - U\right) \qquad (4) $$

The option to accept a job offer is taken if $w/r \ge U$, giving the reservation wage equation

$$ \xi = rU \qquad (5) $$

The reservation wage is defined as the minimum acceptable wage, and it is obtained as the solution to (4) and (5). Partial models of search and empirical research on the duration of unemployment have explored generalized forms of Eqn. (4) to derive the properties of transitions of individuals from unemployment to employment (see Devine and Kiefer 1991). For a known $F(w)$ with upper support A, Eqn. (4) specializes to

$$ \xi = b + \frac{a}{r}\int_{\xi}^{A} (w - \xi)\, dF(w) \qquad (6) $$
Various forms of (6) have been estimated in the empirical literature or used in the construction of partial models of the labor market. The transition rate from unemployment to employment (the unemployment 'hazard rate') is $a(1 - F(\xi))$, and it depends both on the arrival rate of offers and on the individual's reservation wage. The parameters a and r, and those of the wage offer distribution, can be made to depend on the individual's characteristics. The empirical literature has generally found that unemployment compensation acts as a disincentive to individual transitions, through the influence of b on the reservation wage, but the effect is not strong. The offer arrival rate increases the hazard rate, despite the fact that the reservation wage increases in a. A number of personal characteristics influence reservation wages and transitions, including age, education, and race.
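As a concrete illustration, suppose wage offers are uniform on [0, A]; then the integral in (6) has a closed form and the reservation wage solves a simple fixed-point problem. The parameter values in the sketch below are hypothetical, chosen only to illustrate the comparative statics:

```python
from scipy.optimize import brentq

# For F uniform on [0, A], Eqn. (6) reduces to
# xi = b + (a/r) * (A - xi)**2 / (2*A).
def reservation_wage(b, a, r, A=1.0):
    return brentq(lambda xi: b + (a / r) * (A - xi) ** 2 / (2 * A) - xi, b, A)

xi = reservation_wage(b=0.4, a=0.1, r=0.05)
print(xi)                                 # ~0.578
print(0.1 * (1.0 - xi))                   # hazard a*(1 - F(xi)), ~0.042
# Raising unemployment income b raises xi and so lowers the hazard rate,
# the disincentive effect noted above.
print(reservation_wage(b=0.5, a=0.1, r=0.05))   # ~0.634 > 0.578
```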
3. Two-sided Matching for Given Wages

Recent work in the economics of search has focused mainly on the equilibrium implications of frictions and search decisions. An equilibrium model needs to specify the decisions of firms and to solve for the offer arrival rate a. In addition, a mechanism is needed to ensure that search is an ongoing process. The latter is achieved by introducing a probability $\lambda\,\delta t$ that a negative shock hits a job during a short time interval $\delta t$. When the negative shock arrives, the job is closed down ('destroyed') and the worker has to search again to find another job. For the moment, λ is assumed to be a positive constant (see Sect. 5).
3.1 The Aggregate Matching Function

To derive the equilibrium offer arrival rate, suppose that at time t there are u unemployed workers and v vacant jobs. In a short time interval $\delta t$ each unemployed worker moves to employment with probability $a\,\delta t$, so in a large market the total flow of workers from unemployment to employment, and the total flow of jobs from the vacant state into production, are both $au\,\delta t$. A key assumption in the equilibrium literature is that the total flows satisfy an aggregate matching function, a black box that gives the outcome of the search process in terms of the inputs into search. If the u unemployed workers are the only job seekers and they search with a fixed intensity of one unit each, and firms also search with a fixed intensity of one unit for each of the v job vacancies, the matching function gives

$$ m = m(u, v) \qquad (7) $$

with m standing for the flow of matches, au. The function is usually assumed to be continuous and differentiable, with positive first partial derivatives and negative second derivatives, and to satisfy constant returns to scale (see Petrongolo and Pissarides 2000 for a review). A commonly used matching function in the theoretical literature, derived from the assumption of uncoordinated random search, is the exponential

$$ m = v\,\bigl(1 - e^{-ku/v}\bigr), \qquad k > 0 \qquad (8) $$
The empirical literature, however, estimates a log-linear (constant elasticity) form, which parallels the Cobb–Douglas production function specification, with the elasticity on unemployment estimated in the range 0.5–0.7. The fact that job matching is pairwise implies that the transition rates of jobs and workers are related Poisson processes. Given $au = m$, the rate at which workers find jobs is $a = m/u$. If q is the rate of arrival of workers to vacant jobs, then total job flows are $qv = au$, and so $q = m/v$. The equilibrium literature generally ignores individual differences and treats the average rates $m/u$ and $m/v$ as the rates at which jobs and workers, respectively, arrive to each searching worker and vacant job. By the properties of the matching function,

$$ q = m\left(\frac{u}{v},\, 1\right) \qquad (9) $$
$$ \;\;\; = m(\theta^{-1}, 1) \equiv q(\theta) \qquad (10) $$

with $q'(\theta) < 0$ and elasticity $-\eta(\theta) \in (-1, 0)$. Here $\theta = v/u$ is a measure of the tightness of the market, the ratio of the inputs of firms into search to the inputs of workers. Similarly, the transition rate of workers is

$$ a = m\left(1,\, \frac{v}{u}\right) = \theta\, q(\theta) \qquad (11) $$
By the elasticity properties of $q(\theta)$, $\partial a/\partial\theta > 0$. In the steady state, the inverses of the transition rates, $1/q(\theta)$ and $1/\theta q(\theta)$, are the expected durations of a vacancy and of unemployment, respectively. The influence of tightness on the transition rates is independent of the level of wage rates: if there are more vacant jobs for each unemployed worker, the arrival rate of workers to the typical vacancy is lower and the arrival rate of job offers to the typical unemployed worker is higher, irrespective of the level of wages. When a worker and a firm meet and are considering whether or not to stay together, they are not likely to take into account the implications of their action for market tightness and for the transition rates of other unmatched agents. For this reason, the influence of tightness on the transition rates is known as a search externality. Several papers in the economics of search have explored the efficiency properties of equilibrium given the existence of the externality (see Sect. 4.3.2 and Diamond 1982, Mortensen 1982, Pissarides 1984, Hosios 1990).

3.2 Job Creation

A job is an asset owned by the firm and is valued in a perfect capital market characterized by the same risk-free interest rate r. Suppose that in order to recruit a worker, a firm has to bear a set-up cost K to open a job vacancy, and in addition has to pay a given flow cost c for the duration of the vacancy. The flow cost can be interpreted as an advertising and recruitment cost, or as the cost of having an unfilled position in what might be a complex business environment (which is not modeled). Let V be the value of a vacant position and J the value of a filled one. Reasoning as in the case of the value of a job seeker, U, the Bellman equation satisfied by V is

$$ rV = -c + q(\theta)\,(J - V) \qquad (12) $$

The vacant job costs c, and the firm is offered the option of taking a worker at rate $q(\theta)$. Since the firm has to pay the set-up cost K to create a job, it has an incentive to open a job if (and only if) $V \ge K$. The key assumption made about job creation is that the gains from job creation are always exhausted, so jobs are created up to the point where

$$ V = K \qquad (13) $$

Substitution of V from (12) into (13) yields

$$ J = K + \frac{rK + c}{q(\theta)} \qquad (14) $$

Competition requires that the expected present discounted value of profit when the worker arrives, the value of a filled job J, should be just sufficient to cover the initial cost K and the accumulated costs for the duration of the vacancy: interest on the initial outlay, rK, and the ongoing cost c, for the expected duration of the vacancy, $1/q(\theta)$. Let productivity be a constant p in all jobs and the wage rate a constant w. With break-up rate λ, the value of a job satisfies the Bellman equation

$$ rJ = p - w - \lambda J \qquad (15) $$

The flow of profit to the firm is $p - w$ until a negative shock arrives and reduces the job's value to 0. Replacing J in (14) by its expression in (15) yields the job creation condition

$$ \frac{p - w}{r + \lambda} - K - \frac{rK + c}{q(\theta)} = 0 \qquad (16) $$

Equation (16) determines θ for each w and parallels the conventional labor demand curve. A higher wage rate makes it more expensive for firms to open jobs, leading to lower market tightness. Frictions slow down the arrival of suitable workers to vacant jobs, so the firm incurs some additional recruitment costs; if the arrival rate were infinitely fast, as in Walrasian economics, $q(\theta)$ would be infinite and the last term in (16) would disappear. The assumptions underlying (13) ensure that at the margin the recruitment cost is just covered.

4. Wage Setting

With search frictions, there is no supply of labor that can be equated with demand to give wages. The conventional supply of labor is constant here: there is a fixed number of workers in the market, usually normalized to unity, and each supplies a single unit of labor. In search equilibrium there are local monopoly rents. Firms and workers who are together can start producing immediately; if they break up, they can start producing only after they find another partner through an expensive process of search. The value of the search costs that they save by staying together corresponds to a pure economic rent: it could be taken away from them and they would still stay together. Wages need to share those rents. Two approaches dominate the literature. The first and more commonly used approach employs the solution to a Nash bargaining problem (see Diamond 1982, Pissarides 2000). The Nash solution allocates the rents according to each side's 'threat points'; the threat points in this case are the returns from search, and the Nash solution gives each side the same surplus over and above its expected return from search. Generalizing this solution concept, wages are determined such that the net gain accruing to the
worker from the match, $W - U$, is a fixed proportion β of the total surplus, the sum of the worker's surplus, $W - U$, and the firm's, $J - V$. This sharing rule can be obtained as the maximization of the product

$$ (W - U)^{\beta}\,(J - V)^{1-\beta}, \qquad \beta \in (0, 1) \qquad (17) $$
The coefficient β can be given the interpretation of bargaining strength, although strictly speaking bargaining strength in the conventional Nash solution is given by the threat points U and V (see Binmore et al. 1986 for an interpretation of β in terms of rates of time preference). The second approach to wage determination postulates that the firm 'posts' a wage rate for the job, which the worker either takes or leaves. The posted wage may be above the worker's reservation wage because of 'efficiency wage' arguments, for example in order to reduce labor turnover, encourage more effort, or attract more job applicants.
4.1 Bargaining

The value of an employed worker, W, satisfies the Bellman equation

$$ rW = w - \lambda\,(W - U) \qquad (18) $$

The worker earns wage w and gives up the employment gain $W - U$ when the negative shock arrives. No worker has an incentive to quit into unemployment or to search for another job while employed, as long as $w > b$, which is assumed. With (15) and (17) in place, the Nash bargaining solution gives the sharing rule

$$ W - U = \beta\,(W - U + J - V) \qquad (19) $$

Making use of the value equations and the sharing rule yields

$$ w = rU + \beta\,\bigl(p - (r+\lambda)K - rU\bigr) \qquad (20) $$
$$ \;\;\; = (1-\beta)\,b + \beta\,\bigl[\,p - (r+\lambda)K + (rK + c)\,\theta\,\bigr] \qquad (21) $$

There is a premium on the reservation wage which depends on the worker's bargaining strength and the net surplus produced. Wages depend positively on unemployment income and on the productivity of the job, the first because of the effect that unemployment income has on the cost of unemployment, and the second because of the monopoly rents and bargaining. Wages also depend on market tightness: in tighter markets they are higher, because the expected duration of unemployment in the event of disagreement is shorter. Empirical evidence supporting this wage equation has been found by a number of authors (e.g., Blanchflower and Oswald 1994), although it should be noted that similar wage equations can also be derived from other theoretical frameworks.

4.2 Equilibrium

Equation (21) replaces the conventional labor supply curve and closes the system. When combined with the job creation condition (16), it gives unique solutions for wages and market tightness. A variety of intuitive properties are satisfied by this equilibrium. For example, higher labor productivity implies higher wages and tightness; higher unemployment income implies higher wages but lower tightness. It remains to obtain the employment rate in equilibrium. The labor force is fixed in size, so by appropriate normalization, if at some time t unemployment is $u_t$, employment is $1 - u_t$. In a short time interval $\delta t$, $a_t u_t \delta t$ workers are matched and $\lambda(1 - u_t)\delta t$ workers lose their jobs. Given that $a_t = \theta_t q(\theta_t)$, the evolution of unemployment is given by

$$ u_{t+\delta t} = u_t + \lambda\,(1 - u_t)\,\delta t - \theta_t q(\theta_t)\, u_t\, \delta t \qquad (22) $$

Dividing through by $\delta t$ and taking the limit as $\delta t \to 0$ yields

$$ \dot{u} = \lambda\,(1 - u) - \theta q(\theta)\, u \qquad (23) $$

Because the solution for θ is independent of u, this is a stable differential equation for unemployment with a unique equilibrium

$$ u = \frac{\lambda}{\lambda + \theta q(\theta)} \qquad (24) $$
Equation (24) is often referred to as the Beveridge curve, after William Beveridge, who first described such a 'frictional' equilibrium. Plotted in a space with vacancies on the vertical axis and unemployment on the horizontal, it is a curve convex to the origin. In the early literature the impact of frictions on the labor market was measured by the distance of this curve from the origin (see also Pissarides 2000, Blanchard and Diamond 1989).
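The whole steady-state system can be computed numerically. The sketch below substitutes the wage equation (21) into the job creation condition (16), solves for θ, and then obtains unemployment from (24), using $q(\theta) = \theta^{-0.5}$ as in the earlier sketch; all parameter values are hypothetical:

```python
from scipy.optimize import brentq

# Hypothetical parameters: productivity, unemployment income, interest rate,
# separation rate, bargaining share, set-up cost, vacancy flow cost.
p, b, r, lam, beta, K, c = 1.0, 0.5, 0.04, 0.15, 0.5, 0.0, 0.3

def wage(theta):                         # Eqn. (21)
    return (1 - beta) * b + beta * (p - (r + lam) * K + (r * K + c) * theta)

def job_creation(theta):                 # left-hand side of Eqn. (16),
    # with (rK + c)/q(theta) = (rK + c) * theta**0.5 for q = theta**-0.5
    return (p - wage(theta)) / (r + lam) - K - (r * K + c) * theta ** 0.5

theta = brentq(job_creation, 1e-6, 50.0)
u = lam / (lam + theta ** 0.5)           # Eqn. (24), theta*q(theta) = theta**0.5
print(theta, wage(theta), u)             # ~1.24, ~0.94, ~0.12
```

Raising p in this sketch raises both θ and the wage and lowers u, while raising b raises the wage but lowers θ, the intuitive comparative statics noted above.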
4.3 Wage Posting

Wage posting is an alternative to wage bargaining, with different implications for search equilibrium. The firm posts a wage for the job, and the worker who locates it either takes it or leaves it. Three different models of wage posting are examined.
4.3.1 Single offers, no prior information

Workers search sequentially, one firm at a time; they discover a firm's offer only after they have made contact, and they have to accept or reject it (with no recall) before they can sample another firm. The worker who contacts a firm that posts wage $w_i$ has two options: accept the wage offer and enjoy expected return $W_i$, obtained from (18) with $w = w_i$, or reject it and continue searching for return U. The profit-maximizing firm chooses $w_i$ subject to $W_i \ge U$. Since the firm has no incentive to offer the worker anything over and above the minimum required to make workers accept its offer, in this model wages are driven down to the worker's reservation wage, $w_i = b$ (Diamond 1971). In terms of the bargaining solution (see (20), (21)), the 'Diamond' equilibrium requires $\beta = 0$. The model can then be solved as before, by replacing β with 0 in the job creation condition and the wage equation. Tightness, and consequently unemployment, absorb all shocks other than those operating through unemployment income, which also changes wages. (There is a paradox in this model: if wages are equal to unemployment income, no worker has an incentive to search.)
4.3.2 Competitive search.

The next model increases the amount of information that workers have before they contact the firm (Moen 1997). Workers can see the wage posted by each firm but because of frictions they cannot be certain that they will get the job if they apply. If the probability of a job offer across firms is the same, all workers will apply for the highest-wage job. Queues will build up and this will reduce the probability of an offer. In equilibrium, the wage offer and the queue characterizing each job have to balance each other out, so that all firms get applicants. The length of the queue is derived from the matching process. The firm that posts wage w_i makes an offer on average after 1/q(θ_i) periods and a job applicant gets an offer on average after 1/θ_i q(θ_i) periods. Implicit in this formulation is the assumption that more than one firm offers the same wage, and firms compete for the applicants at this wage. Workers apply to only one job at a time. Suppose now there is a firm, or group of firms, such that when workers join their queue they derive expected income stream Ū, the highest in the market. The constraint facing a firm when choosing its wage offer is that the worker who applies to it derives at least as much expected utility as Ū. The expected profit of the firm that posts wage w_i solves

rV_i = −c + q(θ_i)(J_i − V_i)   (25)

rJ_i = p − w_i − λJ_i   (26)

The worker’s expected returns from applying to the firm posting this wage satisfy the system of equations

rU_i = b + θ_i q(θ_i)(W_i − U_i)   (27)

rW_i = w_i − λ(W_i − Ū)   (28)

The firm chooses w_i to maximize V_i subject to U_i ≥ Ū. The first-order maximization conditions imply that all firms offer the same wage, which satisfies

W − U = [η/(1−η)](J − V)   (29)
Comparison of (29) with (19) shows that the solution is indeed similar to the Nash solution, but with the share of labor given by the negative of the elasticity of q(θ) (which is equal to the unemployment elasticity of the underlying matching function). The rest of the model can be solved as in the Nash case. There is a special significance to the share of labor obtained in this formulation. In the case where wages are determined according to the Nash rules, the firm and worker choose the wage after they meet, so it is unlikely that they will internalize the search externalities; they do not take into account the effect of their choices on the transition rates of unmatched agents. It can be shown that with constant returns to scale there is a unique internalizing rule, which requires β = η, the solution of the wage posting model considered here (Hosios 1990, Pissarides 1984, 2000). For this reason, this particular wage posting model is often called the competitive search equilibrium. The key assumption that gives efficiency is the relaxation of the informational restrictions on workers, which lets them know both the firm’s wage offer and the length of the queue associated with it.
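The coincidence β = η can be checked numerically. The sketch below again assumes a Cobb–Douglas matching technology, so that q(θ) = μθ^(−η) and the negative of the elasticity of q is the constant η; the functional form and parameter values are assumptions of the example, not of the text.

```python
# With q(theta) = mu * theta**(-eta), the (negative of the) elasticity of q(theta)
# equals eta, so posted wages deliver the Hosios share beta = eta.
mu, eta = 0.7, 0.5   # assumed values

def q(theta):
    return mu * theta ** (-eta)

def labor_share(theta, h=1e-6):
    """-(d log q / d log theta), computed by a central difference."""
    return -(q(theta * (1 + h)) - q(theta * (1 - h))) / (2 * h * q(theta))

for theta in (0.5, 1.0, 2.0):
    print(f"theta={theta:3.1f}  implied labor share={labor_share(theta):.4f}  (eta={eta})")
```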
4.3.3 Wage differentials.

The third version of the wage posting model relaxes the assumption that workers have the choice of at most one wage offer at a time, but does not allow workers knowledge of the wage offer before they apply (Burdett and Judd 1983, Burdett and Mortensen 1998, Montgomery 1991). The easiest way to introduce this is to allow workers the possibility of search on the job, i.e., to let them continue looking for a better job after they have accepted one. Suppose for simplicity that job offers arrive to searching workers at the same rate a, irrespective of whether they are employed or unemployed. Suppose also that job search is costless (except for the time cost) and job changing is costless. Then, if a worker is earning w now and another offer paying w′ comes along, the worker accepts the new offer if (and only if) w′ > w. The worker’s reservation wage is the current wage.
Unemployment pays b, so no firm can pay below b and attract workers. Consider a firm that pays just above b. Anyone applying for a job from the state of unemployment will accept its offer, but no one else will. Employed job seekers will have no incentive to quit their jobs to work at a wage close to b. In addition, this firm’s workers will be quitting to join other firms, which may be paying above b. So a firm paying a low wage will have high turnover and will be waiting long before it can fill its vacancies. A firm paying a high wage will be attracting workers both from unemployment and from other firms paying less than itself. Moreover, it will not be losing workers to other firms. So high-wage firms will have fewer vacant positions. Now suppose the two firms have access to the same technology. The low-wage firm enjoys a lot of profit from each position, but has a lot of vacant positions. The high-wage firm enjoys less profit from each position, but has them filled. It is possible to show under general conditions that high-wage and low-wage firms will co-exist in equilibrium. Burdett and Mortensen (1998) show this by assuming that there is a distribution of wage offers for homogeneous labor, F(w), and demonstrating that (a) no two firms will offer the same wage, and (b) firms can choose any wage between the minimum b and a maximum w̄, and enjoy the same profit in the steady state. The maximum is given by

w̄ = p − [λ/(a+λ)]² (p − b)   (30)

where as before p is the productivity of each worker and λ the job destruction rate. The interesting result about wage posting in this model is that once the assumption that the worker can only consider one offer at a time is relaxed, a distribution of wage offers for homogeneous labor arises. The distribution satisfies some appealing properties. As a → 0, the upper support tends to w̄ = b, the Diamond solution. At the other extreme, as a → ∞, it can be shown that all wage offers converge to p, the competitive solution. Intuitively, a = 0 maximizes the frictions suffered by workers and a = ∞ eliminates them.
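The two limits just noted can be read off equation (30) directly; the short sketch below does so with illustrative parameter values (all of them assumptions of the example).

```python
# Upper support of the wage-offer distribution, equation (30):
# w_bar = p - (lambda / (a + lambda))**2 * (p - b)
p, b, lam = 1.0, 0.4, 0.02   # productivity, unemployment income, destruction rate (assumed)

def w_bar(a):
    """Highest wage posted when offers arrive at rate a both on and off the job."""
    return p - (lam / (a + lam)) ** 2 * (p - b)

for a in (0.0, 0.01, 0.1, 1.0, 100.0):
    print(f"a={a:6.2f}  w_bar={w_bar(a):.4f}")
# a = 0 gives w_bar = b (the Diamond outcome); large a pushes w_bar toward p
# (the competitive outcome), as stated in the text.
```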
5. Job Destruction

A body of empirical literature shows that there is a lot of ‘job churning,’ with many jobs closing down and new ones opening to take their place. It is also found that both the job creation and the job destruction rates, especially the latter, vary a lot over the cycle (see Leonard 1987, Dunne et al. 1989, Davis et al. 1996). This is an issue addressed by search theorists. Returning to the model with a Nash wage rule, the job creation flow is the matching rate m(u, v). The job creation rate is defined as the job creation flow divided by employment, m(u, v)/(1−u), which is

m(u, v)/(1−u) = θq(θ) u/(1−u)   (31)

Since both θ and u are endogenous variables of the model that respond to shocks, the model predicts a variable job creation rate that can be tested against the data (see Mortensen and Pissarides 1994, Cole and Rogerson 1999). But the job destruction flow is λ(1−u), so the job destruction rate is a constant λ, contrary to observation. Two alternative ways of making it variable are considered.
5.1 Idiosyncratic Shocks

In the discussion so far, the productivity of a job is p until a negative shock arrives that reduces it to zero. Generalizing this idea, suppose that although initially the productivity is p, when a shock arrives it changes it to some other value px. The component p is common to all jobs but x is specific to each job. It has distribution G(x) in the range [0, 1]. New jobs start with specific productivity x = 1; over time, shocks arrive at rate λ that transform this productivity to a value between 0 and 1, according to the distribution G. The firm has the choice of either continuing to produce at the new productivity, or closing the job down. The idea is that initially firms have a choice over their product type and technique and choose the combination that yields maximum productivity. Over time techniques are not reversible, so if the payoffs from a given choice change, the firm has the choice of either continuing in the new environment or destroying the job. As in the case where workers are faced with take-it-or-leave-it choices from a given wage distribution, Mortensen and Pissarides (1994) show that when the firm has the choice of either taking a productivity or leaving it, its decision is governed by a reservation productivity, denoted R. The reservation productivity depends on all the parameters of the model. Profit maximization under the Nash solution to the wage bargain implies that both workers and firms agree about the optimal choice of R, i.e., which jobs should be closed and which should continue in operation. With knowledge of R, the flow of job closures generalizes to λG(R)(1−u), the fraction of jobs that get shocks below the reservation productivity. The job destruction rate then becomes λG(R), which responds to the parameters of the economy through the responses of R to shocks. An interesting feature of the job creation and job destruction rates, which conforms to observation, is that because the job creation rate depends on unemployment, which is a slow-moving variable, whereas the job destruction rate does not depend on it, the job destruction rate is more volatile than the job creation rate. For example, a rise in general productivity p, associated with a positive cyclical shock, reduces job destruction immediately by reducing the reservation productivity, but increases job creation at first and then reduces it, as tightness rises at first but unemployment falls in response.
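As a sketch of the resulting job destruction rate λG(R), suppose for illustration that G is uniform on [0, 1]; both the distributional assumption and the parameter values are inventions of this example, not part of the text.

```python
# Job destruction rate lambda * G(R): the fraction of jobs drawing an idiosyncratic
# productivity below the reservation level R, per unit of time.
lam = 0.02   # arrival rate of idiosyncratic shocks (assumed)

def G(x):
    """Uniform distribution function on [0, 1] (illustrative assumption)."""
    return min(max(x, 0.0), 1.0)

def job_destruction_rate(R):
    return lam * G(R)

# A positive aggregate shock that lowers the reservation productivity R
# cuts job destruction at once:
for R in (0.7, 0.6):
    print(f"R={R:.1f}  job destruction rate={job_destruction_rate(R):.4f}")
```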
5.2 Technological Progress and Obsolescence

Another way of modeling job destruction borrows ideas from Schumpeter’s theory of growth through creative destruction (see Aghion and Howitt 1994, Caballero and Hammour 1994). Jobs are again created at the best technology but once created, their technology cannot be updated. During technological progress, the owners of the jobs have the option of continuing in production with the initial technology or closing the job down and opening another, more advanced one. As new jobs are technologically more advanced, the wage offers that workers can get from outside improve over time. There comes a time when the worker’s outside options have risen sufficiently to render the job obsolete. Formally, the model can be set up as before, with the technology of the job fixed at the frontier technology at creation time and wages growing over time because of growth in the returns from search. The value of a job created at time 0 and becoming obsolete at T is

J_0 = ∫_0^T e^(−(r+λ)t) [p(0) − w(t)] dt   (32)
As before, r is the discount rate and λ is the arrival rate of negative shocks that may lead to earlier job destruction; p(0) is the initial best technology and w(t) the growing wage rate. The job is destroyed when w(t) reaches p(0); i.e., the job life that maximizes J_0 is defined by w(T*) = p(0). A useful restriction to have when there is growth is to assume that both unemployment income and the cost of recruitment grow at the exogenous rate of growth of the economy. As an example, consider the wage eqn. (21) with the restriction K = 0 and b(t) = be^(gt), c(t) = ce^(gt), where g is the rate of growth of the economy. T* then satisfies

e^(gT*) = (1−β)p / [(1−β)b + βcθ]   (33)
Jobs are destroyed more frequently when growth is faster, when unemployment income is higher and when the tightness of the market is higher.
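Equation (33) pins down the job lifetime in closed form, T* = (1/g) ln[(1−β)p / ((1−β)b + βcθ)]. A sketch with assumed parameter values illustrates the comparative statics just stated; none of the numbers comes from the text.

```python
import math

# Equation (33): exp(g * T*) = (1 - beta) * p / ((1 - beta) * b + beta * c * theta),
# so T* = (1/g) * log of the right-hand side.
p, b, c = 1.0, 0.4, 0.3    # productivity, unemployment income, recruitment cost (assumed)
beta, theta = 0.5, 0.8     # worker's share and market tightness (assumed)

def T_star(g, b=b, theta=theta):
    """Age at which a job becomes obsolete, given growth rate g."""
    return math.log((1 - beta) * p / ((1 - beta) * b + beta * c * theta)) / g

print(T_star(0.02))              # baseline
print(T_star(0.04))              # faster growth: shorter job life
print(T_star(0.02, b=0.5))       # higher unemployment income: shorter job life
print(T_star(0.02, theta=1.2))   # tighter market: shorter job life
```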
Job destruction in this model has two components: the jobs destroyed because of the arrival of shocks, λ(1−u), and those destroyed because of obsolescence. The latter group consists of jobs created T* periods earlier that survived to age T*, a flow of θq(θ)u e^(−λT*) (note that in the steady state the job creation flow is a constant θq(θ)u). Therefore, the job destruction rate now is λ + θq(θ)u e^(−λT*)/(1−u), which varies in response to changes in T* but also in response to changes in the job creation rate θq(θ)u/(1−u).

See also: Behavioral Economics; Consumer Economics; Consumption, Sociology of; Economics: Overview; Information, Economics of; Labor Markets, Labor Movements, and Gender in Developing Nations; Labor Supply; Market Research; Market Structure and Performance; Stigler, George Joseph (1911–91); Transaction Costs and Property Rights; Wage Differentials and Structure; Work, Sociology of
Bibliography

Aghion P, Howitt P 1994 Growth and unemployment. Review of Economic Studies 61: 477–94
Binmore K G, Rubinstein A, Wolinsky A 1986 The Nash bargaining solution in economic modelling. Rand Journal of Economics 17: 176–88
Blanchard O J, Diamond P A 1989 The Beveridge curve. Brookings Papers on Economic Activity 1: 1–60
Blanchard O J, Diamond P A 1990 The cyclical behavior of the gross flows of US workers. Brookings Papers on Economic Activity 2: 85–155
Blanchflower D G, Oswald A J 1994 The Wage Curve. MIT Press, Cambridge, MA
Burdett K, Judd K 1983 Equilibrium price distributions. Econometrica 51: 955–70
Burdett K, Mortensen D T 1998 Wage differentials, employer size, and unemployment. International Economic Review 39: 257–73
Caballero R J, Hammour M L 1994 The cleansing effect of recessions. American Economic Review 84: 1350–68
Cole H L, Rogerson R 1999 Can the Mortensen–Pissarides matching model match the business cycle facts? International Economic Review 40: 933–59
Davis S J, Haltiwanger J C, Schuh S 1996 Job Creation and Destruction. MIT Press, Cambridge, MA
Devine T J, Kiefer N M 1991 Empirical Labor Economics: The Search Approach. Oxford University Press, Oxford
Diamond P A 1971 A model of price adjustment. Journal of Economic Theory 3: 156–68
Diamond P A 1982 Wage determination and efficiency in search equilibrium. Review of Economic Studies 49: 217–27
Dunne T, Roberts M J, Samuelson L 1989 Plant turnover and gross employment flows in the manufacturing sector. Journal of Labor Economics 7: 48–71
Hosios A J 1990 On the efficiency of matching and related models of search and unemployment. Review of Economic Studies 57: 279–98
Leonard J S 1987 In the wrong place at the wrong time: The extent of frictional and structural unemployment. In: Lang K, Leonard J S (eds.) Unemployment and the Structure of Labor Markets. Basil Blackwell, New York
McCall J J 1970 Economics of information and job search. Quarterly Journal of Economics 84: 113–26
Moen E R 1997 Competitive search equilibrium. Journal of Political Economy 105: 385–411
Montgomery J 1991 Equilibrium wage dispersion and interindustry wage differentials. Quarterly Journal of Economics 106: 163–79
Mortensen D T 1982 The matching process as a noncooperative bargaining game. In: McCall J J (ed.) The Economics of Information and Uncertainty. University of Chicago Press, Chicago, IL
Mortensen D T, Pissarides C A 1994 Job creation and job destruction in the theory of unemployment. Review of Economic Studies 61: 397–415
Mortensen D T, Pissarides C A 1999a Job reallocation, employment fluctuations, and unemployment. In: Woodford M, Taylor J (eds.) Handbook of Macroeconomics. North-Holland, Amsterdam
Mortensen D T, Pissarides C A 1999b New developments in models of search in the labor market. In: Ashenfelter O, Card D (eds.) Handbook of Labor Economics. North-Holland, Amsterdam
Petrongolo B, Pissarides C A 2000 Looking into the black box: A survey of the matching function. Journal of Economic Literature 39: 390–431
Phelps E S et al. 1970 Microeconomic Foundations of Employment and Inflation Theory. Norton, New York
Pissarides C A 1984 Efficient job rejection. Economic Journal 94: 97–108
Pissarides C A 1985 Short-run equilibrium dynamics of unemployment, vacancies, and real wages. American Economic Review 75: 676–90
Pissarides C A 2000 Equilibrium Unemployment Theory, 2nd edn. MIT Press, Cambridge, MA
Stigler G J 1961 The economics of information. Journal of Political Economy 69: 213–25
C. A. Pissarides
Second Language Acquisition

Humans are all born with the capacity to learn and to use a language; but they are not born with a language. It is not part of the human genetic endowment that ‘horse’ means ‘equine quadruped,’ that the past tense is marked by ‘-ed,’ or that the negation follows the finite verb; this knowledge must be derived from the input with which the learner is confronted. The ways which lead this innate language faculty to the knowledge of a particular linguistic system vary considerably, depending on factors such as age, nature of input, and whether this task is undertaken for the first time (‘first language acquisition,’ FLA) or not (‘second language acquisition,’ SLA). SLA is not a homogeneous phenomenon, for at least two reasons. First, it need not wait until the learner has completed FLA; hence, there is a continuous transition from bilingual FLA, in which a child is exposed more or less simultaneously to two systems from birth, to the
adult’s struggles with a new kind of linguistic input. Second, there is a wide range of ways in which the human language faculty gains access to a second linguistic system, ranging from metalinguistic description, as in traditional Latin classes, to language learning by everyday communication, as in the case of a foreign worker. In the history of mankind, explicit teaching of a language is a relatively late phenomenon, and untutored learning was, and probably still is, the most common case; but due to its practical importance, SLA in the classroom still dominates research. Linguists and laymen alike tend to consider children’s path to their mother tongue to be the most important type of language acquisition. This view seems most natural; but it leads easily to a distorted picture of how the human language faculty functions, and what its typical manifestations are. FLA is a very complex mixture of cognitive, social and linguistic developments, and it is not easy to isolate its purely linguistic components. The acquisition of the English tense and aspect system, for example, not only requires the learning of a particular mapping of forms and meanings, but also the development of the concept of time itself. Moreover, most people learn more than one language, albeit to different degrees of perfection. Therefore, the normal manifestation of the human language faculty is a ‘learner variety,’ i.e., a linguistic system which comes more or less close to the linguistic habits of a particular social group. In a child’s case, the final learner variety is usually a ‘perfect replication’ of these habits; children who grow up in multilingual communities often achieve two or even three such perfect replications. Adults who set out to learn another language hardly ever reach a stage where they speak like those from whom they learn; their ‘learner varieties’ normally fossilize at an earlier stage. This does not mean that their final learner variety is less of a language, or less efficient; there is no reason to assume that a linguistic system which says ‘He swam yesterday’ is a superior manifestation of the human language faculty to a system which says ‘He swimmed yesterday’ or even ‘He swim yesterday.’ It is just the way the English do it, and deviations from their norms are stigmatized. If the study of language acquisition, and of SLA in particular, is to inform us about the nature of the human language faculty, then it must not focus on issues of perfect replication and why it fails sometimes, but try to clarify how the human language faculty deals under varying conditions with particular forms of linguistic input to which it has access. The first step to this end is to isolate the crucial factors which play a role in this process, and to look at ways they can vary. The second step is to investigate what happens under varying constellations. The final step is to draw generalizations from these findings and to turn them into a theory not just of language acquisition, but of the nature of human language itself (Klein 1986).
The picture which research on SLA offers at the time of writing is much less systematic. As with so many other disciplines, it has its origin in practical concerns; researchers were looking for scientific ways to improve foreign language teaching, and this seems impossible without a deeper understanding of the principles of SLA. Therefore, most empirical work in this field is still in the classroom. A second source of inspiration was research on FLA, which started much earlier and therefore set the theoretical and methodological stage. More recently, work in theoretical linguistics has increasingly influenced research on SLA. These and other influences, for example from cognitive and social psychology, resulted in a very scattered picture of theories, methods and findings. Rather than reviewing this research, the following discussion will concentrate on three key issues (useful surveys are found in Ellis 1994, Ritchie and Bhatia 1996, Mitchell and Myles 1998, Braidi 1999).
1. SLA and Foreign Language Instruction

The pedagogical background of SLA research has led naturally to a particular view on SLA, for which two assumptions are constitutive: (a) There is a well-defined target of the acquisition process—the language to be learned. This target language is a clearly fixed entity, a structurally and functionally balanced system, mastered by those who have learned it in childhood, and more or less correctly described in grammars and dictionaries. (b) SLA learners miss this target at varying degrees and in varying respects—they make errors in production as well as in comprehension, because they lack the appropriate knowledge or skills. This is the target deviation perspective. It is the teacher’s task to erase, or at least to minimize, the deviations; it is the researcher’s task to investigate which ‘errors’ occur when and for which reasons. As a consequence, learners’ performance in production or comprehension is not studied very much in its own right, as a manifestation of learning capacity, but in relation to a set norm; not in terms of what learners do, but in terms of what they fail to do. The learners’ utterances at some time during the process of acquisition are considered to be more or less successful attempts to reproduce the structural properties of target language utterances. Learners try to do what the mature speaker does, but do it less well. Three reasons make the target deviation perspective so natural and attractive, in fact, almost self-evident. First, it is the natural perspective of the language teacher: language teaching is a normative process, and the teacher is responsible for moving students as closely to some norm as possible. Second, it is also the natural perspective of all of those who had to learn a second language in the classroom—and that means, also, of practically every language researcher. Third, the target deviation perspective provides the researcher with a simple and clear design for empirical work. There is a yardstick against which the learners’ production and comprehension can be measured: the target language, or actually what grammar books and dictionaries say about it. What is measured is the differences between what learners do and what the set norm demands. Therefore, the dominant method in SLA research was, and is, error analysis: Learners’ errors are marked and then either counted and statistically analyzed, or they are interpreted individually (Corder 1981, Ellis 1994, pp. 561–664). There are two problems with this perspective. First, it does not tell us what learners do but what they are not able to do. Second, its results reflect not just the principles according to which the human language faculty functions, but the efficiency of a particular teaching method. Therefore, this approach may be of eminent importance to the language teacher, but it is of limited value if we want to understand the nature of human language.
2. FLA and SLA

Experience shows that FLA normally leads to ‘perfect command’ of the target language, whereas SLA hardly ever does. Why this difference? Is perfect attainment of a second language possible at all? Does the learning process only stop at an earlier point, or does it follow different principles? The last question has found two opposite answers. The identity hypothesis, advocated by many researchers in the early 1970s, claims that the underlying processes are essentially the same across all types of acquisition. Under this view, the fact that the learner already knows a language plays no role: there is no transfer from the ‘source language’ (Odlin 1989). Evidence came mainly from the order in which certain grammatical phenomena, such as inflectional morphemes or the position of negation, are acquired. It turned out, however, that these similarities are quite isolated; there are hardly any supporters of the identity hypothesis anymore. Under the opposite view, it is mainly structural differences between source and target language that cause problems for learners. This contrastive hypothesis has given rise to a number of contrastive grammars for pedagogical purposes. But while there are many clear cases in which learners’ first language interferes in the learning process, structural contrasts can at best account for some properties of the acquisitional process. In acquisition outside the classroom, for example, all learners regularly develop a particular type of ‘learner variety’ which is essentially independent of source and target language (see Klein and Perdue 1997). The net result of thirty years of research is simply that there are similarities as well as dissimilarities. The varying success in final attainment could be due (a) to age differences, or (b) to the fact that there is already a language which blocks the acquisition of a
second language. The second possibility is ruled out by the fact that school-age children normally have no problem in learning a second language to perfection; hence, the varying success must be an age effect. Apparently, the capacity to learn a language does not disappear, but it deteriorates with age. Since this capacity is stored in the brain, it seems plausible to assume that changes in the brain are responsible for the age effect. The clearest statement of this view is Lenneberg’s theory of a biologically fixed ‘critical period,’ during which the brain is receptive to language; it ranges approximately from birth to puberty. After this period, linguistic knowledge can only be learned in a different form, roughly like the knowledge of historical or geographical facts (Lenneberg 1967). This theory has the seductive charm of simple solutions, and hence has been welcomed with great enthusiasm. But as far as is known, all potentially relevant changes in the brain occur in the first four years of life, rather than around puberty. Moreover, all available evidence shows that the capacity to learn a new language deteriorates only gradually; there is no clear boundary at puberty or at any other time. Finally, it could be shown that ‘perfect attainment’ is perhaps rare but definitely possible after puberty (see Birdsong 1999). It appears, therefore, that there is no clear biological threshold to language acquisition; the age effect is due to a much wider array of factors (Singleton 1989).
3. SLA and Theoretical Linguistics

The apparent ease and speed with which children, despite deviant and insufficient input, become perfect speakers of their mother tongue has led Noam Chomsky and other generative grammarians to assume that a great deal of the necessary linguistic knowledge is innate. Since every newborn can learn any language, this innate knowledge must be universal, and it is this ‘universal grammar’ (UG) which is the proper object of linguistic theory. Since languages also differ in some respects (otherwise, SLA would be superfluous), the competence of mature speakers is supposed to include a ‘peripheral part,’ which includes all idiosyncratic properties and must be learned by input analysis, and a ‘core.’ The core consists of a number of universal principles—the UG. Initially, these principles include a number of ‘open parameters,’ i.e., variable parts which must be fixed by input analysis. Chomsky made this point only for FLA, and only in the mid 1980s was the question raised whether UG is still ‘accessible’ in SLA. A number of empirical studies tested the potential ‘resetting’ of various parameters. Spanish, for example, allows the omission of a subject pronoun, a property which is structurally linked to other features such as a relatively rich inflectional morphology and a relatively free word order; these and other properties
form the ‘pro-drop parameter.’ English children have set this parameter the opposite way when acquiring their language. Are adult English learners of Spanish able to ‘reset’ it, or do they have to learn all of these properties by input analysis? Results are highly controversial (see e.g., Eubank 1991, Epstein et al. 1997). Although inspired by theoretical linguistics, most empirical research in this framework keeps the traditional ‘target deviation perspective’; with only a few exceptions, it deals with acquisition in the classroom, hence reflecting the effects of teaching methods. Moreover, there is no agreement on the definition of the parameters themselves; in fact, more recent versions of generative grammar have essentially abandoned this notion. Finally, it is an open issue which parts of linguistic knowledge form the core and which parts belong to the periphery, and hence must be learned from the input. These language-specific parts clearly include the entire lexicon, the inventory of phonemes, inflectional morphology, all syntactic properties in which languages can differ—in short, almost everything. It seems more promising, therefore, to look at how learners construct their learner varieties by input analysis.
4. Learner Varieties

The alternative to the target deviation perspective is to understand the learners’ performance at any given time as an immediate manifestation of their capacity to speak and to understand: form and function of these utterances are governed by principles, and these principles are those characteristic of the human language faculty. Early attempts in this direction are reflected in notions such as ‘interlanguage,’ ‘approximate systems’ and so on. Since the 1980s, most empirical work on SLA outside the classroom has taken this ‘learner variety perspective’ (von Stutterheim 1986, Perdue 1993, Dietrich et al. 1995). In its most elaborate form, it can be characterized by three key assumptions (Klein and Perdue 1997). (a) During the acquisitional process, learners pass through a series of learner varieties. Both the internal organization of each variety at a given time, as well as the transition from one variety to the next, are essentially systematic in nature. (b) There is a small set of principles which are present in all learner varieties. The actual structure of an utterance in a learner variety is determined by a particular interaction of these principles. The kind of interaction may vary, depending on various factors, such as the learner’s source language. With ongoing input analysis, the interaction changes. Picking up some component of noun morphology from the input, for example, may cause the learner to modify the weight of other factors to mark the grammatical status of a noun phrase. Therefore, learning a new feature is not adding a new piece to a puzzle which the learner has to
put together. Rather, it entails a sometimes minimal, sometimes substantial reorganization of the whole variety, whereby the balance of the various factors successively approaches the balance characteristic of the target language. (c) Learner varieties are not imperfect imitations of a ‘real language’ (the target language), but systems in their own right. They are characterized by a particular lexical repertoire and by a particular interaction of structural principles. Fully developed languages, such as Spanish, Chinese or Russian, are only special cases of learner varieties. They represent a relatively stable state of language acquisition—that state where learners stop learning because there is no difference between their variety and the variety of their social environment, from which they get input. Thus, the process of language acquisition is not to be characterized in terms of errors and deviations, but in terms of the twofold systematicity which it exhibits: the inherent systematicity of a learner variety at a given time, and the way in which such a learner variety evolves into another one. If we want to understand the acquisitional process, we must try to uncover this twofold systematicity, rather than look at how and why a learner misses the target.

See also: First Language Acquisition: Cross-linguistic; Foreign Language Teaching and Learning; Language Acquisition; Language Development, Neural Basis of
Bibliography

Birdsong D (ed.) 1999 Second Language Acquisition and the Critical Period Hypothesis. Erlbaum, Mahwah, NJ
Braidi S M 1999 The Acquisition of Second Language Syntax. Arnold, London
Corder P 1981 Error Analysis and Interlanguage. Oxford University Press, Oxford, UK
Dietrich R, Klein W, Noyau C 1995 Temporality in a Second Language. Benjamins, Amsterdam
Ellis R 1994 The Study of Second Language Acquisition. Oxford University Press, Oxford, UK
Epstein S, Flynn S, Martohardjono G 1997 Second language acquisition: Theoretical and experimental issues in contemporary research. Behavioural and Brain Sciences 19: 677–758
Eubank L (ed.) 1991 Point–Counterpoint: Universal Grammar in the Second Language. Benjamins, Amsterdam
Klein W 1986 Second Language Acquisition. Cambridge University Press, Cambridge, UK
Klein W, Perdue C 1997 The basic variety, or couldn’t natural languages be much simpler? Second Language Research 13: 301–47
Lenneberg E 1967 Biological Foundations of Language. Wiley, New York
Mitchell R, Myles F 1998 Second Language Learning Theories. Arnold, London
Odlin T 1989 Language Transfer. Cambridge University Press, Cambridge, UK
Perdue C (ed.) 1993 Adult Language Acquisition: Crosslinguistic Perspectives. Cambridge University Press, Cambridge, UK, 2 Vols
Ritchie W C, Bhatia T (eds.) 1996 Handbook of Second Language Acquisition. Academic Press, New York
Singleton D 1989 Language Acquisition: The Age Factor. Multilingual Matters, Clevedon, UK
Stutterheim C von 1986 Temporalität in der Zweitsprache (Temporality in the Second Language). De Gruyter, Berlin
W. Klein
Second World War, The

1. The Second World War: A Narrative

World War II was an event of massive significance. For at least fifty years after its end in 1945 it continued to condition societies and ideas throughout the world. Much of the politics of the second half of the twentieth century can be read as occurring in an ‘after-war’ context. The war exacted a death toll of at least 60 million, and probably tens of millions more than that (figures for China and the rest of Asia are mere guesses, and the USSR’s sacrifice has risen from seven to 20 to 29 or more million as time has passed, circumstances varied, and the requirements of history altered). A majority of the casualties were civilians, a drastic change from World War I, when some 90 percent of deaths were still occasioned at the fronts. Moreover, the invention of the atom bomb during the war and its deployment by the USA at Hiroshima and Nagasaki (August 6 and 9, 1945) suggested that, in any future nuclear conflict, civilians would compose 90 percent or more of the victims. When this apparent knowledge was added to the revelations of Nazi German barbarism on the eastern front and the Nazis’ massacre of European Jewry, either in pit killings or when deliberately transported to such death camps as Auschwitz-Birkenau, Treblinka, Sobibor, Chelmno, Belzec, and Majdanek, another casualty of the war seemed to be optimism itself. Certainly at ‘Auschwitz’ and perhaps at Hiroshima, ‘civilization,’ the modernity of the Enlightenment, the belief in the perfectibility of humankind, had led not to hope and life but instead to degradation and death. This more or less open fearfulness, with its automatic resultant linking of a pessimism of the intellect to any optimism of the will among postwar social reformers, may be the grandest generalization that can be made about the meaning of World War II. Big history, however, should not lose sight of microhistory. Actually World War II was fought on many fronts, at different times, for different reasons, and with different effects. In this sense, there was a multiplicity of World War IIs. In September 1939, a war broke out between Nazi Germany and authoritarian Poland. The liberal democratic leadership of Britain and France intervened
Second World War, The saying that they would defend Poland, although in practice they made do with ‘phoney war’ until the Nazi forces took the military initiative, first in Denmark and Norway, and then in the Low Countries and France in April–May 1940. In June 1940, Fascist Italy entered the war as had been envisaged in the ‘Pact of Steel’ signed with its Nazi ally in May 1939. War now spread to the Italian empire in North and East Africa and, from October 1940, to the Balkans. Italian forces botched what they had hoped would be a successful Blitzkrieg against Greece and the effect over the next year was to bring most of the other Balkan states into the conflict. Often these fragile states dissolved into multiple civil wars, the most complicated and the one of most lasting significance began in Yugoslavia from March–April 1941. On June 22, 1941, the Nazi Germans invaded the Soviet Union, commencing what was, in some eyes, the ‘real’ World War II, and certainly the one that was inspired by the most direct ideological impulse and which unleashed the most horrendous brutality. In the course of the campaign in the east it is estimated that the Germans sacked 1,710 towns and 70,000 villages. During the epic siege of Leningrad from 1941 to 1944, a million or so of the city’s inhabitants starved to death. In their invasion, the Germans were joined by an assortment of anticommunist allies and friends, including military forces from authoritarian Romania and Fascist Italy. Many Lithuanians, Latvians, and Estonians, and quite a few anti-Soviet elements within the USSR (Ukrainian nationalists, people from the Caucasus, and others) acted an auxiliaries of Nazi power. The Nazis were even embarrassed by a ‘Russian’ army under General A. A. Vlasov, willing to fight on their side against Stalin and his system. Volunteers also came from pro-fascist circles in France, and from Spain and Portugal, states ruled by clerical and reactionary dictators who hated communists but were not fully reconciled to the radical thrust of much Nazi-fascist rhetoric and some Nazifascist policy. On December 7, 1941, the war widened again when the Japanese airforce attacked the American Pacific fleet at anchor at Pearl Harbor in Hawaii. In the following weeks, the Japanese army and navy thrust south and east, dislodging the British from Singapore by January 15, 1942. They went on to seize the Philippines and Dutch East Indies (later Indonesia) and were in striking distance of Australia before being checked at the Battle of the Coral Sea in June 1942. They simultaneously continued the terrible campaign that they had been waging in China since 1937 (or rather since 1931, when they had attacked Manchukuo or Manchuria). In their special wars, the militarist Japanese leadership tried to throw off what they called the imperialist yoke of US capitalism and the older ‘white’ metropolitan empires. The purity of their antiimperial motives was damaged, however, by their own commitment to empire and by their merciless killing of 13772
Asians. Moreover, as in Europe, their invasions often touched off varieties of civil war provoked by the highly complex stratification of society in the region. At the forefront of such campaigns were often local nationalists who imagined communities subservient neither to European powers nor the Japanese. On December 11, 1941, Germany and Italy, loyal to the terms of the anti-Comintern pact, had also declared war on the USA, somewhat ironically so, given that Japan, checked by military defeats in an unofficial war against the USSR at Khalkin Gol and Nomonhan in 1938–9, had now engaged with other enemies. The Italians would find out the implications in North Africa, the Germans after the Allied invasion of France on ‘D-Day’ (June 6, 1944), as well as in Italy from September 8, 1943 (the Fascist dictator, Benito Mussolini, overthrown on July 25, was thereafter restored as a sort of German puppet in northern Italy; allied forces moved slowly up the peninsula from the south, liberating Rome on June 4, 1944 but only reaching Milan at the end of the war in late April 1945). Of the participants in the war, the USA, once fully mobilized, possessed the biggest and most productive economy, and was therefore of crucial importance in the eventual defeat of the anti-Comintern powers. The campaign the Americans fought with the most passion, and with an evident racism of their own, was the war against Japan. In another sense, the USA had a relatively soft war, not disputed on its own territory and not requiring the sort of physical or spiritual sacrifice obligatory from most other combatants. The USA’s special World War II was not really a visceral one. If the war was fought in many different ways, it is equally true that the variety of conflicts did not all end at the same time and in the same way. The Nazi armies surrendered on May 8, 1945, the Japanese on August 15. But matters were more complicated than that. France’s special World War II would commemorate ‘victory’ from the date of the liberation of Paris on August 25, 1944 (and General Charles De Gaulle would firmly proclaim that Paris and France had liberated themselves). In most of Nazi-fascist occupied Europe and in parts of Asia, partisan movements had never altogether accepted defeat. Communists were invariably prominent in such resistance, even if quite a few still envisaged themselves as fighting as much for the Soviet revolution as for the liberty of their own nation state. Every successive ‘liberation,’ in Europe most frequently coming after the military victory of the Red Army, and in the Pacific that of the USA, had its own special character. Yugoslavia and China were two especially complicated places where the resistance was very strong but where it was contested, not only by the Nazi-fascists and the Japanese but also by local anticommunist and nationalist or particularist forces. The effects and memory of their wars were by definition to be very different from such societies as the USA, Australia, and the UK which did not endure foreign
occupation, though the last, with its severe experience of bombing, was itself different from the other two. In sum, World War II was not just an enormously influential event but also an extraordinarily complicated one. Its complexity has in turn stimulated many passionate and long-lasting debates about its historical meaning.
2. The Causes of War

For many years it was customary to argue that World War II, as compared to World War I, had a simple cause. This was ‘Hitler’s war.’ Mainstream analysis continues to ascribe to the Nazi dictator great responsibility for the invasion of Poland and for the spreading of the war thereafter, and especially for the launching of Operation Barbarossa against the USSR. Nonetheless, from the 1960s, the course of historiography, especially as exemplified in the rise of social history, did not favor a ‘Great Man’ view of the past and tended to urge that even dictators had limits to their free will. As early as 1964, English radical historian A. J. P. Taylor (1964) argued in a book entitled The Origins of the Second World War that the several crises which led up to September 1939 and what he sardonically called the ‘War for Danzig’ needed to be understood in a variety of contexts, including the peace settlements at the end of World War I, the course of German political and social history, the institutionalization of the Russian revolution, with its victorious but feared and paranoid communist and then Stalinist regime, and the lights and shadows of democratic liberalism in Western Europe. Taylor wrote with a flaunted stylistic brilliance and practiced a brittle historical cleverness. He was destined to be misunderstood and often gloried in the misunderstanding. His book thus produced an enormous controversy, the first of many to be sparked by attempts to define the meaning of World War II. At the same time, Taylor’s idiosyncrasies ensured that his work could easily enough be dismissed by those mainstream historians who liked to feel the weight of their commitment to Rankean principles and to make their liking evident. Nonetheless the issues raised by Taylor did not go away. In West Germany, the so-called ‘Fischer controversy,’ sparked by the Hamburg liberal historian Fritz Fischer’s Griff nach der Weltmacht, a massively documented study of German aims during World War I, published in 1961, raged through the decade. Although Fischer, then and thereafter, wrote almost exclusively about World War I and about imperial Germany, he was read as commenting on World War II and, indeed, on Germany’s divided fate in its aftermath. Two issues were prominent. Was imperial Germany an aggressive power in a way that bore comparison with the Nazi regime during the 1930s?
Was the motive of the German leadership as much domestic as foreign—did they seek foreign adventure and even world war in order to divert the pressure building for social democracy? Was the appalling conflict from 1939 to 1945 caused by the ‘German problem,’ which may have begun as early as 1848 and may have continued after 1945? Fischer’s work was more directly influential than Taylor’s, and the issue of the relationship between Innenpolitik and Aussenpolitik fitted neatly into the preoccupations of new social historians who, by the 1970s, were scornfully dismissing the work of ‘old fashioned diplomatic historians.’ For all that, specialist works on the causation of World War II continued to privilege the power of Hitler and acknowledge the ideological thrust of Nazism. The English independent Marxist Tim Mason may have tried to pursue a Fischerian line in asking how much the diplomatic crises of 1938–9 were prompted by the contradictions of Nazi economic and social policy, but his essays remained at the periphery of most analysis. Rather, such firmly Rankean historians as Gerhard Weinberg and Donald Watt assembled evidence which, in their eyes, only confirmed that the war was caused by Hitler. The newest and most authoritative English-language biographer of the Führer, Ian Kershaw, despite his background in social history, does not disagree. Hitler may have been erratic as an executive. The Nazi totalitarian state and society, contrary to its propaganda about militant efficiency and a people cheerfully bound into a Volksgemeinschaft, may in practice have often been ramshackle. But the dictator, Kershaw argued, did possess power. Indeed, so all-embracing was his will that Germans strove to ‘work towards’ their Führer, to accept his ideas and implement his policies before he had fully formulated them. For a generation in the wake of the Fischer controversy, scholarship on Nazism had separated into ‘intentionalists’ (advocates of Great Man history) and ‘functionalists’ (those who preferred to emphasize the role of structures and contexts and who were especially alert to the ‘institutional Darwinism’ of the Nazi regime). Now Kershaw, often a historian of the golden mean, seemed to have found a way to resolve and end that conflict. The only major variants on this reaffirmation that Hitler had provoked the war came from certain conservative viewpoints. Some anticommunist historians focused on the Ribbentrop–Molotov pact (August 23, 1939), placing responsibility for the war on ‘Stalin’ and the Russian revolution in what seemed to others a highly tendentious effort to blame the victim. More common was the view expressed most succinctly by Zionist historian Lucy Dawidowicz that the whole conflict was in essence a ‘war against the Jews.’ In this interpretation, German nationalism, German anticommunism, German racism towards Slavs, Nazi repression of the socialist and communist left, none of these in any sense equated with Hitlerian
anti-Semitism. Hitler (or, in the variant recently made notorious by Daniel Goldhagen, the Germans) wanted to kill Jews; that was the purpose of the Nazi regime; that was the aim of its wars. In other ex-combatant societies, further examples of local focus are evident. In England, fans of appeasement still existed, the most prominent being John Charmley. For him, the problem with the war lay simply in Britain’s engagement in it. As far as the British empire was concerned, Nazism did not need to be fought and the USSR should never have been an ally. Worst of all was the fact that the USA had dominated the postwar world. Charmley has never quite said so, but the implication of his work is that the ‘real’ World War II for Britain, and the one it most dramatically lost, was implicitly fought against its American ‘cousins.’ The Asian-Pacific conflict has similarly been subject to historical revision. In the 1960s Gabriel Kolko and other American ‘new leftists’ applied a Marxian model to their nation’s foreign policy, being very critical of the gap between its idealistic and liberal theory and its realist and capitalist practice. They were in their turn duly subjected to withering fire from more patriotic and traditional historians. Nonetheless a consensus grew that, at least in regard to the onset of the American–Japanese war, US policy makers carried some responsibility. By the 1980s, liberal historian John Dower was even urging that the two rivals had been ‘enemies of a kind,’ neither of them guiltless of racism and brutality. In Japan, by contrast, an officially sponsored silence long hung over everything to do with the war. The Ministry of Education was particularly anxious that schoolchildren not be exposed to worrying facts about such terrible events as the massacre in Nanking in 1937, the practice of germ warfare, the exploitation of ‘comfort women,’ and the many other examples of Japanese murder, rape, and pillage in Asia and the Pacific. Nonetheless, a stubborn undercurrent of opinion, exemplified in the work of historian Ienaga Saburo, continued to contest the Ministry line and, by the 1990s, the Japanese leadership had gone further than ever before in admitting some of the misdeeds of its militarist predecessors. Fully critical history may still not be especially appreciated in Tokyo. But, by the end of the 1990s, Japan was hardly the only society to behave that way, in a world in which the ideology of economic rationalism had achieved unparalleled hegemony, backed by what American democratic historian Peter Novick has called ‘bumper sticker’ lessons from the past.
3. The Course of the War

In the preceding paragraphs it has not always been possible to keep fully separate discussions about the
coming of World War II from what happened after hostilities commenced. In any case, military history in the pure sense, just like diplomatic history, soon lost ground professionally after 1945. To most eyes, the military history of the war can swiftly enough be told. Of the anti-Comintern states, Germany and Japan, but not Italy, won rapid initial victories, exulting in their respective Blitzkriegs. But their triumphs were always brittle. Germany and its allies may have been good at getting into wars, but their ideologies made it difficult for them thereafter to contemplate any policy except the complete liquidation of the enemy. Hitler and the Japanese imperial and military leadership thus made no attempts to offer compromise from a position of strength, and Mussolini’s occasional flirtation with the idea always involved Nazi sacrifice in their wars, especially that in the east, rather than Italian loss. Nor did the anti-Comintern states make the most of the huge territories which they had conquered, and the immense material resources that they therefore controlled. Nazi Germany is something of a case study in this regard. In the West, where the war was always gentler, the Nazis found plenty of direct and indirect collaborators. They were thus, for example, able to harness a very considerable proportion of the French economy to their cause. They also started to construct a new economic order that was not utterly unlike some of the developments which would occur in Western Europe after Nazi-fascism had been defeated. With extraordinary contradiction for a state built on an utter commitment to racial purity, Nazi Germany, already before 1939, needed immigrants to staff its economy. Once the war began, this requirement became still more pressing. One partial solution was to import workers by agreement with its ally Italy and by arrangement with the friendly Vichy regime in France. Not all such French ‘guest-workers’ came unwillingly, and not all had especially bad wars. In Germany they joined other, more reluctant, immigrants from the east, who were often little more than slave laborers. Poles and Soviet prisoners of war constituted the majority of these; at first they were frequently worked to death. However, as the Nazi armies turned back and the war settled into one of attrition and retreat, as symbolized by the great defeat at Stalingrad (November 1942–January 1943), the Germans began to treat even laborers from the east in a way that allowed some minimum chance of their survival and which also permitted some tolerable productivity from their labor. The exception was, of course, the Jews, who, from September–October 1941, became the objects of the ‘Final Solution,’ a devotion to murder confirmed by officials who attended the Wannsee conference in January 1942. In terms of fighting a war, the adoption of the policy of extermination was, of course, counterproductive in many senses, among which was the economic. In staffing and fueling the trains which transported the Jews to the death camps, in the low
productivity of these and the other camps, the Nazis wasted resources needed at the front. Saul Friedlander will write elsewhere in this volume about the debates over the meaning of the Holocaust. It is worth noting in this segment, however, that each combatant society has argued about its particular experience of being visited by the Nazi war machine and therefore of being exposed to collaboration with it. Two important examples occurred in France and the USSR. The spring of 1940 brought disaster to the Third French Republic. Now was the time of what Marc Bloch, one of the founders of the great structuralist historical school of the Annales, a patriotic Frenchman and a Jew, called the ‘strange defeat.’ Military historians have demonstrated that France was not especially inferior in armament to the invading Germans. Rather, France lost for reasons to do with morale and the domestic divisions of French society. As a result, by June 1940 the French state and empire had collapsed. During the next four years, the inheritance of the Third Republic was disputed between the Vichy regime headed by Marshal Pétain within the rump of French metropolitan territories, the ‘Free French’ under General Charles De Gaulle, resident in London and there, however reluctantly and churlishly, dependent on Allied goodwill and finance, and a partisan movement which gradually became more active in the occupied zones. This last was typically divided between communists and other forces, some of which disliked communists as much as they hated the invaders. The years of Axis occupation were thus also the time of the ‘Franco-French civil war,’ with killings and purge trials extending well beyond liberation. The meaning of war, occupation, and liberation in France has been much disputed after 1945. It took a film maker, Marcel Ophuls in Le Chagrin et la pitié (1971), and an American historian, Robert Paxton, to break a generation of silence about the troubling implications of this period of national history, although Socialist President François Mitterrand (1981–95), with his own equivocal experience of Vichy, was scarcely an unalloyed advocate of openness during his term in office. Historian Henry Rousso has brilliantly examined the ‘Vichy syndrome’ and done much to expose some of the obfuscations favored by many different leadership groups in post-1945 French politics. Perhaps his work does not go far enough, however. With its fall, and then with decolonization after 1945, France had lost its political empire, promising to become just another European state. However, this loss was curiously compensated by the rise and affirmation of French culture. In almost every area of the humanities, such French intellectuals as De Beauvoir, Braudel, Barthes, Foucault, Lévi-Strauss, Lyotard, Baudrillard, and Nora charted the way to postmodernity. They did so sometimes invoking the pro-Nazi philosopher Martin Heidegger and almost always without much reckoning of the collapse of the French nation state in 1940. The intellectuals of France
were much given to forgetting their nation’s fall, the better to affirm their own rights to cultural imperium. If France provides a case study of the transmutation of defeat into victory, the USSR offers the reverse example of a victor whose people would eventually learn that ‘actually’ they had lost. The nature of the Soviet war effort is still in need of research. Sheila Fitzpatrick, a social historian of the USSR, has depicted a population by 1939 brutalized and depressed by the tyranny, incompetence, and contradictions of Stalinism. Her work does not really explain, however, how that same populace fought so stubbornly ‘for the motherland, for Stalin.’ No doubt the absurd murderousness of Nazi policies gave them little alternative. No doubt the aid which eventually flowed through from the USA was of great significance. But something does remain unexplained about how ‘Stalin’s Russia’ won its ‘Great Patriotic War.’
4. The Consequences of the War

By now very clear, however, are the consequences of the war for the USSR. In the short term the war made the Soviet state the second superpower, the global rival to the USA, and gave Stalin, until his death in 1953, an almost deified status. However, as the postwar decades passed, it became clear that the USSR and its expanded sphere of influence in Eastern Europe were not recovering from the war with the speed being dramatically exemplified in Western Europe and Japan. Indeed, the history of the USSR, at least until Gorbachev’s accession to the party secretaryship in 1985 and, arguably, until the fall of the Berlin Wall and the collapse of communism (1989–91), is best read as that of a generation who had fought and won its wars (including the terrible domestic campaigns for collectivization as well as the purges of the 1930s). Leaders and people were unwilling or unable to move beyond that visceral experience. After 1945, the USSR became the archetypal place ‘where old soldiers did not die nor even fade away.’ Brezhnev, Chernenko, and their many imitators further down the power structure were frozen into a past that had an ever-diminishing connection with a world facing a new technological and economic revolution. Other ex-combatant societies were more open to change than the USSR, but a certain sort of remembering and a certain sort of forgetting can readily enough be located in them, too. Memory proved most threatening in Yugoslavia. There the victorious partisans under Josip Broz Tito seemed for a time to have won a worthwhile victory. The special history of their campaigns against the Nazi-fascist occupiers of their country allowed them claims to independence from the USSR, which they duly exercised after 1948. At the same time the barbarity, during the war, of the collaborationist Croat fascist regime under Ante
Pavelić, whose savagery embarrassed even the Germans, seemed to suggest that the region was indeed best administered by a unitary state. The fall of communism, however, also brought down a communist Yugoslavia in which such Serb leaders as Slobodan Milošević, unable to believe in official ideals, increasingly recalled the nationalism of wartime Četniks rather than the internationalist Marxism once espoused by the partisans. This memory of war and murder justified new wars and new murders, even if with the somewhat ironical result that, by the end of the 1990s, the (Serb) winners of the last war had become losers and the (Slovene, Croat, Bosnian Moslem, and Kosovar) losers had become winners. In Italy, the country with the largest communist party in the West and a polity which had a 'border idiosyncrasy,' bearing some comparison with Yugoslavia's role in the Eastern Bloc, the inheritance of war and fascism similarly possessed peculiar features. Postwar Italy began by renouncing fascism, empire, and war, in 1946 abandoning the monarchy that had tolerated the imposition of Mussolini's dictatorship and, in 1947, adopting a constitution which made considerable claim that the new Republic would be based on labor. In practice, however, Italy took its place in the Cold War West. Its purging of ex-Fascists was soon abandoned and both the power elites and the legislative base of the new regime exhibited much continuity with their Fascist predecessors. Nonetheless, from the 1960s, an ideology of antifascism was accorded more prominence in a liberalizing society. From 1978 to 1985, Sandro Pertini, an independent socialist who had spent many years in a Fascist jail and been personally involved in the decision to execute Mussolini, became Italy's president. Widely popular, he seemed an embodiment of the national rejection of the Fascist past. Once again, however, the process of memory was taking a turn and a different useable past was beginning to emerge. Left terrorists in the 1970s had called themselves the new Resistance and declared that they were fighting a Fascist-style state—the governing Christian Democrats were thought to be merely a mask behind which lurked the Fascist beast. The murder of Aldo Moro in 1978 drove Italians decisively away from this sort of rhetoric and, in the 1980s and 1990s, Italians sought instead a 'pacification' with the past in which ex-Fascists had as much right to be heard as ex-partisans. Among historians, 'anti-anti-Fascists,' led by Renzo De Felice, the biographer of Mussolini, provided evidence and moral justification for this cause. Media magnate and conservative politician Silvio Berlusconi joined those who agreed that Italy's World War II had lost its ethical charge. Among the Western European ex-combatant states, perhaps the UK was the place where the official myth of the war survived with least challenge. A vast range of British society and behavior was influenced by Britain's war. The Welfare State, as codified by the
postwar Labor government under Clement Attlee, was explained and justified as a reward for the effort of the British people in the 'people's war.' Wartime Conservative Prime Minister Winston Churchill, despite his many evident limitations, remained a national icon. British comedies from the Goons in the 1950s to Dad's Army in the 1970s and 1980s to Goodnight Sweetheart in the 1990s were obsessively set in the war. From the alleged acuteness of their secret service activities to the alleged idealism of their rejection of Nazism, the British have constantly sought to preserve the lion's share of the positives of World War II for themselves. The suspicion of a common European currency and the many other examples of continuing British insularity in turn reflect the British cherishing of the fact that they fought alone against the Nazi-fascists from 1940 to 1941, and express their associated annoyed perplexity that somehow their wartime sacrifice entailed a slower route to postwar prosperity compared with that of their continental neighbors. Memory after memory, history after history, World War IIs, in their appalling plenitude, still eddy around. As the millennium ended, another historian wrote a major book about the meaning of an aspect of the war, and about the construction of that meaning. Peter Novick's The Holocaust in American Life (1999) daringly wondered whether the privileging of the Nazi killing of the Jews in contemporary Jewish and even gentile American discourse is altogether a positive development. Being a historical victim at one time in the past, he argued cogently, can obscure as well as explain. His caution is timely. It is probably good that World War IIs are with us still; it will be better if the interpretation of so many drastic events can still occasion democratic debate, courteous, passionate, and humble debate, and if we can therefore avoid possessing a final solution to its many problems.

See also: Cold War, The; Contemporary History; First World War, The; Genocide: Historical Aspects; Holocaust, The; International Relations, History of; Military History; War: Anthropological Aspects; War: Causes and Patterns; War Crimes Tribunals; War, Sociology of; Warfare in History
Bibliography
Bartov O 1986 The Eastern Front 1941–5: German Troops and the Barbarisation of Warfare. St. Martin's Press, New York
Bosworth R J B 1993 Explaining Auschwitz and Hiroshima: History Writing and the Second World War 1945–1990. Routledge, London
Bosworth R J B 1998 The Italian Dictatorship: Problems and Perspectives in the Interpretation of Mussolini and Fascism. Arnold, London
Calder A 1969 The People's War: Britain 1939–45. Pantheon, London
Charmley J 1993 Churchill: The End of Glory—A Political Biography. Hodder and Stoughton, London
Dawidowicz L 1986 The War Against the Jews 1933–45, rev. edn. Penguin, Harmondsworth, UK
Dower J W 1986 War Without Mercy: Race and Power in the Pacific War. Pantheon, New York
Fischer F 1967 Germany's Aims in the First World War. Chatto and Windus, London
Fischer F 1986 From Kaiserreich to Third Reich: Elements of Continuity in German History, 1871–1945. Allen and Unwin, London
Fitzpatrick S 1999 Everyday Stalinism: Ordinary Life in Extraordinary Times: Soviet Russia in the 1930s. Oxford University Press, New York
Gorodetsky G 1999 Grand Delusion: Stalin and the German Invasion of Russia. Yale University Press, New Haven, CT
Hogan M J (ed.) 1996 Hiroshima in History and Memory. Cambridge University Press, Cambridge, UK
Ienaga S 1979 Japan's Last War: World War II and the Japanese 1931–1945. Australia University Press, Canberra, Australia
Kershaw I 1999 Hitler 1889–1936: Hubris. W. W. Norton, New York
Kershaw I 2000 Hitler 1936–1945: Nemesis. Allen Lane, London
Kolko G 1968 The Politics of War: The World and United States Foreign Policy 1943–5. Random House, New York
Milward A 1977 War, Economy and Society 1939–1945. University of California Press, Berkeley, CA
Novick P 1999 The Holocaust in American Life. Houghton Mifflin, Boston
Parker R A C 1990 Struggle for Survival: The History of the Second World War. Oxford University Press, Oxford, UK
Rousso H 1991 The Vichy Syndrome: History and Memory in France Since 1944. Harvard University Press, Cambridge, MA
Taylor A J P 1964 The Origins of the Second World War, rev. edn. Penguin, Harmondsworth, UK
Thorne C 1986 The Far Eastern War: States and Societies 1941–5. Unwin, London
Tumarkin N 1994 The Living and the Dead: The Rise and Fall of the Cult of World War II in Russia. Basic Books, New York
Watt D C 1989 How War Came: The Immediate Origins of the Second World War 1938–1939. Pantheon Books, New York
Weinberg G L 1994 A World at Arms: A Global History of World War II. Cambridge University Press, Cambridge, UK
R. J. B. Bosworth
Secondary Analysis: Methodology

Secondary analysis refers to a set of research practices that involve utilizing data collected by someone else or data that have been collected for another purpose (e.g., administrative records). It is used, to varying degrees, across a wide range of disciplines and throughout the world. Given this breadth, it is not surprising that it has taken on numerous forms. It is also conducted for several distinct reasons. Although it is not a research methodology per se, several features distinguish secondary analysis from other research activities. In turn, these features create opportunities and limitations for the secondary
analyst. Central issues concern: (a) data availability, access, and documentation; (b) maintaining confidentiality and privacy pledges made by primary researchers; and (c) proprietary rights and data ownership.
1. Secondary Analysis as a 'Methodology'
The use of secondary data in behavioral and social sciences is ubiquitous, appearing in a number of traditional (i.e., quantitative) and untraditional (i.e., qualitative) forms. Despite its rather extended roots in the social and behavioral sciences, it is not widely celebrated as a method. As partial evidence of its relative obscurity, consider the fact that between 1978 and the present, a search of PsychInfo uncovered only 36 articles and books using the keyword 'secondary analysis.' Furthermore, the phrase 'secondary analysis' appeared in fewer than half of the article or book titles. Books on the topic are also relatively scarce; fewer than a dozen were uncovered through the same search (e.g., Boruch et al. 1981, Elder et al. 1993, Hyman 1972, Stewart 1984). Ironically, the apparent obscurity of secondary analysis as a methodology is due to its pervasiveness. That is, the use of secondary data sources is so commonplace in many fields (e.g., economics, education, and sociology) that there is little need to call attention to it as a separable methodology. This makes sense because, unlike ethnography, survey research, or quasi-experimentation, which each have distinctive methodological procedures and practices, secondary analysis does not involve a new or different set of tactics. Even from a statistical point of view, there is little to distinguish it from primary analysis, and most of the measurement, design, and statistical issues facing the secondary analyst are largely the same as those faced by a primary analyst. The obvious exception is that the secondary analyst is constrained by the scope and nature of the design (e.g., the sample, sample sizes, attrition, missing data, measures, and research design) inasmuch as these have been specified by someone else (McCall and Appelbaum 1991). As such, secondary analysis boils down to a data resource, not a methodology per se. However, the use of secondary data—especially when micro data records are used—does involve a unique set of logistical, ethical and practical considerations.
2. Varieties of Secondary Analysis
Because primary data can assume a number of forms (e.g., data based on cross-sectional surveys, administrative records, panel surveys, observations), secondary analysis has taken on a variety of forms. Two
main categories can be identified: (a) some data are developed for the explicit purpose of serving as a national resource; and (b) other forms of data are byproducts of the actions of individuals and organizations. The latter are included in the definition of secondary analysis because they play a large role in contemporary research. Given the increased use of the Internet, more administrative records, facts, and statistics (public and private) also will be readily available for use in research.

2.1 Traditional Forms of Secondary Analysis
The most common forms of secondary data include population censuses, continuous or regular surveys, national cohort studies, multisource data sets, administrative records, and one-time surveys or research studies. In the US, interest in secondary analysis was prompted by the appearance of public opinion polls and surveys in the 1940s and 1950s, and continued in the 1960s, when the federal government began, in earnest, gathering data through large-scale surveys based on representative sampling (Hyman 1972). Of particular interest to social science researchers are the large-scale, nationally representative, longitudinal (panel) surveys. In the USA, even cursory literature searches reveal hundreds of publications in economics and sociology that use data drawn from, for example, the Panel Study of Income Dynamics (PSID), the National Longitudinal Survey of Youth (NLSY), and the General Social Survey (GSS). In the United Kingdom, the General Household Survey (GHS) has been conducted annually since the beginning of the 1970s. The reuse of large-scale survey data is also evident across many nations simultaneously, as when Ravallion and Chen (1997) examined the correspondence between the rate of growth in gross domestic product and changes in poverty rates. The willingness of domestic and foreign governments to invest in data collection prompted Cherlin (1991) to speculate that the term secondary analysis will become obsolete. Many large-scale surveys are being sponsored by governments with no primary data analyses in mind. Whereas large-scale surveys like the PSID are deliberately fielded to answer a multitude of research questions about populations and subgroups, substantial data gathering also is undertaken by governments as byproducts of their operation, to monitor their own processes and outcomes, or to assess environmental conditions or events (e.g., weather, rainfall). These are generally referred to as administrative or governmental records. The use of governmental records in research has a long tradition; over 100 years ago, Émile Durkheim used government statistics to examine some of the causes of suicide (Cherlin 1991). Individual research studies also are fodder for secondary analysis. These reanalyses are undertaken to check the accuracy of statistical conclusions in
primary studies; to test new statistical procedures or substantive hypotheses; to resolve conflicts among researchers; or to test supplemental hypotheses suggested in primary analyses. The latter are usually conducted by the primary analyst, and this category represents the prevalent use of the term 'secondary analysis.' In recent years, secondary analysts have reanalyzed multiple studies that use the same instrumentation, a tactic more akin to the spirit of meta-analysis. A particularly important role for secondary analysis of individual studies is testing or demonstrating the superiority of new statistical methods. A thoughtful example is provided by Muthen and Curran (1997). They showed that, by exerting greater control over extraneous sources of error, latent growth curve models produced larger intervention effects than had previously been reported. More generally, in the past 25 years, efforts to address causal questions with extant data have produced substantial advances in statistical modeling (Duncan 1991). And, reanalysis of data from program evaluation studies has been undertaken to assure policy-makers that the results of primary studies are technically sound (Boruch et al. 1981).

2.2 Less Traditional Forms of Secondary Analysis
Although not a traditional form of secondary analysis, Webb et al. (1965) demonstrated that a substantial amount of 'data' is produced by individuals and organizations (public and private) as byproducts of their daily transactions. In their now classic text Unobtrusive Measures: Nonreactive Research in the Social Sciences, they identified a host of unconventional ways in which researchers can reuse existing data. In making their case, they showed how physical traces (erosion and accretion), the content of running records (e.g., mass media, judicial records), private records (sales records, credit card purchases), and contrived observations (e.g., hidden cameras) can be used as sources of data in research. The literature is filled with studies based on creative uses of these artifacts. Examples include assessing: the type, amount, and 'quality' of household garbage to gauge the dietary and recycling habits of Arizona residents; the amount of trash left in the streets by revelers at Mardi Gras to estimate the size of the daily crowd; and the differential wear and tear seen in separate sections of the International Encyclopedia of the Social Sciences to ascertain their popularity. Finally, whereas secondary analysis has been historically viewed as a quantitative endeavor, reanalysis of qualitative data is now regarded as a legitimate part of the enterprise. It would appear that Webb et al. (1965) had a substantial influence on subsequent generations of primary researchers. The most underacknowledged uses of secondary data are within primary evaluation
studies, where a mix of new and existing data is increasingly being used to evaluate the effectiveness of social interventions. The use of mixed method evaluation designs represents a core feature of contemporary evaluation theory and practice (Lipsey and Cordray 2000).

2.3 Advantages and Disadvantages of Secondary Analysis
Secondary analysis offers several advantages over primary data collection in behavioral and social sciences, but it also has its shortcomings. On the positive side of the ledger, re-use of data is efficient. These efficiencies include: (a) replication (or not) of findings across investigators; and (b) the discovery of biases in conventional statistical methods. Both of these benefit science by winnowing false hypotheses more quickly and offering new evidence (estimates) in their place. Testing new hypotheses, beyond those envisioned by the data developers, adds to the efficiency of knowledge acquisition. These benefits have served as partial justification for the cost of conducting large-scale, nationally representative surveys (using cross-sectional and panel designs) that query individuals on a wide array of topics. Some topics involving special populations (e.g., twins) or long time frames cannot be investigated without reliance upon archives and data sharing among investigators. Alternatively, secondary data impose limits on what can be studied, where, and over what period of time. McCall and Appelbaum (1991) suggest using a Feasibility Matrix of Sample × Measure × Assessment Age as a tool for prescreening the potential utility of secondary data sets. In addition, technical problems (e.g., selection biases, sample attrition, and missing data) can be so severe as to limit the value of the primary data set. As such, assessing data quality probably needs to be included in the McCall–Appelbaum matrix. The existence of data can short-circuit one facet of the scientific process, tempting the analyst to 'mine' the data rather than initiate analyses based on a theory or hypothesis. Although interesting results may emerge, such practices can lead to unreliable findings or findings limited to a single operationalization. The extent to which these problems in secondary analysis have influenced knowledge development is unknown. But a mismatch between theory and data can be avoided with proper consideration of the relevance and quality of each data source.
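To make the prescreening logic concrete, the following minimal sketch checks whether a candidate data set covers every sample-by-measure-by-assessment-age cell that a research question requires, with a crude data-quality rating added in the spirit of the amendment suggested above. The cell labels, quality scores, and threshold are purely hypothetical illustrations, not anything drawn from McCall and Appelbaum (1991):

```python
# Hypothetical feasibility prescreen inspired by the McCall-Appelbaum matrix.
# All cell labels, quality ratings, and the threshold are invented for
# illustration; a real prescreen would be built from the data set's codebook.

required_cells = {
    ("low-income families", "reading score", "age 6"),
    ("low-income families", "reading score", "age 10"),
}

# Cells documented in the candidate data set, each with a rough quality
# rating (here, the share of non-missing observations in that cell).
available_cells = {
    ("low-income families", "reading score", "age 6"): 0.92,
    ("low-income families", "math score", "age 10"): 0.88,
}

MIN_QUALITY = 0.80  # assumed acceptability threshold

missing = {cell for cell in required_cells if cell not in available_cells}
too_poor = {cell for cell in required_cells - missing
            if available_cells[cell] < MIN_QUALITY}

if missing or too_poor:
    print("Fails prescreen.")
    print("  Missing cells:", sorted(missing))
    print("  Low-quality cells:", sorted(too_poor))
else:
    print("Passes prescreen; the data set merits a closer look.")
```

The design point is simply that feasibility is judged cell by cell before any substantive analysis begins; a single missing or low-quality cell can rule a data set out for a given research question.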
3. What is Unique About Secondary Analysis?
Using data generated by someone else does raise several issues that make secondary analysis somewhat unique. Obviously, secondary analysis is possible only when the data are available, easy to access, and in a form that is usable. Availability and access can be facilitated or
impeded by a number of factors. In particular, because data do not 'speak for themselves,' they must be well documented. Data from government-sponsored, large surveys, panels and so on are often routinely archived and well documented. This is not uniform across all types of secondary data. Data sharing among individual investigators can become contentious, especially in light of questions about proprietary rights and data ownership. Researchers are obliged to honor original pledges of confidentiality and privacy. Balancing the desire to share data with these ethical requirements may require configuring the data in a way that may limit how it is disclosed and the methodological options available to the secondary analyst.

3.1 Availability, Access and Documentation
Establishment of data archives, advances in computer technology, and the emergence of the World Wide Web (WWW) have greatly facilitated access to data and solved some of the early problems that plagued the first generation of secondary analysts (e.g., poor documentation, noncommon data formats and language). Since the 1960s, the Interuniversity Consortium for Political and Social Research (ICPSR) has functioned as a major repository and dissemination service in the US. The National Archives have served a similar function for some governmental data. Increasingly, these roles have been devolved to individual governmental agencies that have developed skills in storing and disseminating their own data. In addition, the Internet has the potential for revolutionizing the use and transfer of data. Secondary analysis also has been institutionalized within journal and professional codes of conduct. To facilitate access to data appearing in scientific journals, many journals have adopted policies whereby contributing authors are expected to make the data from their studies available for a designated period of time (usually three to five years). Similarly, professional associations (e.g., the American Psychological Association) have incorporated data sharing into their codes of ethical behavior. Whereas archives and agencies require that data be properly documented, journals and professional associations are generally silent on this aspect of the data sharing process. Because data documentation is a process that records analytic decisions as they unfold over the course of the study, primary study authors need to be aware of these nontrivial editorial and ethical demands (see Boruch et al. 1981).
3.2 Ethical Considerations and Disclosure
The need to protect the confidentiality and privacy of research participants is sometimes at odds with the desire to make data available to others. Resolving these competing values requires careful consideration
Secondary Analysis: Methodology at the time that primary research is conducted, documented, stored, and disseminated. Concealing the identity of participants can often be accomplished by removing personal identifiers from the data file. To the extent that identification is still possible through deductive disclosure (combining information to produce a unique profile of an individual), alternative procedures are needed. If identifiers are needed to link records (as in longitudinal research), additional layers of protection are needed. Fortunately, Boruch and Cecil (1979) provide a comprehensive treatment of the available statistical procedures (e.g., inoculating raw data with known amounts of error, randomized response techniques, collapsing categories) and institutional procedures (e.g., third parties who would serve as ‘data brokers’) that can be used to relax these problems. Data sharing among individual researchers requires explicit attention to ethical, statistical, organizational, and logistical (e.g., video editing and image modification) issues throughout the research process.
3.3 Proprietary Rights and Data Ownership
Changing policies, practices, and technology will facilitate data sharing and, as a consequence, increase use of data collected by someone else. Alternatively, secondary analysis of data for the purposes of addressing disputes among analysts can be quite difficult to negotiate. When conflict arises, it often revolves around who 'owns' the data and if, when, how, and how much of it should be disclosed to others. Establishing proprietary rights to research and evaluation data is not a simple matter. Data ownership depends on how the research was financed, policies of the sponsoring and host organizations, conditions specified in laws (e.g., Freedom of Information Act, Privacy Act), data sharing policies of the journal in which the research is published, and ethical guidelines of professional associations to which authors belong. As researchers embark on primary or secondary analyses, it is necessary to understand these avenues and constraints.
4. A Summary Note
Access to quantitative and qualitative data from governments, businesses, and individual researchers has greatly facilitated the practice of secondary analysis. Changes in information technology—notably greater use of the World Wide Web—will undoubtedly enhance these practices even further at primary and secondary levels of research and evaluation.

See also: Archives and Historical Databases; Data Archives: International; Databases, Core: Demography and Registers; Intellectual Property: Legal Aspects; Intellectual Property Rights: Ethical Aspects; International Research: Programs and Databases; Meta-analysis: Overview; Privacy of Individuals in Social Research: Confidentiality; Unobtrusive Measures
Bibliography
Boruch R F, Cecil J S 1979 Assuring the Confidentiality of Social Research Data. University of Pennsylvania Press, Philadelphia, PA
Boruch R F, Wortman P M, Cordray D S and Associates (eds.) 1981 Reanalyzing Program Evaluations: Policies and Practices for Secondary Analysis of Social and Educational Programs, 1st edn. Jossey-Bass, San Francisco, CA
Cherlin A 1991 On analyzing other people's data. Developmental Psychology 27(6): 946–8
Duncan G J 1991 Made in heaven: Secondary data analysis and interdisciplinary collaborators. Developmental Psychology 27(6): 949–51
Elder Jr. G H, Pavalko E K, Clipp E C 1993 Working with Archival Data: Studying Lives. Sage Publications, Newbury Park, CA
Hyman H H 1972 Secondary Analysis of Sample Surveys: Principles, Procedures and Potentialities. Wiley, London
Lipsey M W, Cordray D S 2000 Evaluation methods for social intervention. Annual Review of Psychology 51: 345–75
McCall R B, Appelbaum M I 1991 Some issues of conducting secondary analysis. Developmental Psychology 27(6): 911–17
Muthen B O, Curran P J 1997 General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods 2: 371–402
Ravallion M, Chen S H 1997 What can new survey data tell us about recent changes in distribution and poverty? World Bank Economic Review 11: 357–82
Stewart D W 1984 Secondary Research: Information Sources and Methods. Sage, Beverly Hills, CA
Webb E J, Campbell D T, Schwartz R D, Sechrest L 1965 Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally, Chicago, IL
D. S. Cordray
Secrecy, Anthropology of

A simple definition of secrecy as the deliberate concealment of information has served cross-cultural comparison (e.g., Tefft 1980, Bok 1982). Under this definition, people in cultures everywhere have secrets. Since a secret may turn out to be 'empty'—no hidden information in fact exists—some anthropologists have suggested that the definition be refocused on the practice or the 'doing' of secrets rather than on a secret's
informational content. Anthropologists have been primarily interested in what could be called public, or institutionalized, secrecy: those persistent practices of hiding information within kinship, religious, political, or economic groups. Areas of research have included concealments of ritual knowledge and the social functions of secret societies. Uses of more personal, or private, secrets have also attracted attention insofar as these relate to cultural constructions of personhood and to the dynamics of interpersonal relationships. The study of secrets has posed obvious methodological and ethical problems for ethnographers.
1. The Paradox of Secrecy
The revelation of secrets is as important as the keeping of them. Ethnographers have called this the 'paradox' of secrecy (Bellman 1984). For a secret to persist within a social order, it must eventually be told to someone who may in turn keep the secret until it passes along again. Furthermore, the social consequence of secrets rests on the fact that people know of their existence even though content remains hidden. Public awareness of the existence of secrets can afford prestige and power to those who know what others do not. Many accounts of secrecy revisit Georg Simmel's ground-breaking analysis of secrecy and secret societies (1950 [1908]), including his observation that secrets are a form of 'inner property.' Simmel suggested that secrets are a type of 'adornment'—a jewel or bauble that seduces public attention. Secrets with social consequence thus must be at least partly transparent and also periodically transacted and revealed.
2. Public Secrecy
Simmel's commodification of secrets opened the door to a political economy of secrecy. If secrets are property, then they are produced, exchanged, and consumed. Secrets are public because they have exchange value within a knowledge marketplace, and because such exchange has political consequence. Systems of political hierarchy may be grounded in an unequal distribution of secrets. Those in the know may dominate those who do not know so long as the ignorant grant that hidden knowledge has value.
2.1 Secret Societies
Anthropologists have pursued Simmel's original concern with secret societies, documenting a variety of secret groups, associations, lodges, and clubs in cultures around the world whose members are pledged to maintain secrets, ritual and otherwise. The
early comparative category ‘secret society’ was catholic enough to encompass Melanesian men’s houses and grade societies, Australian totemic cults, Native American medicine lodges, West African male and female associations, and more (Webster 1908). Secrecy itself—the one attribute that these diverse organizations had in common—thus enticed anthropological attention (as Simmel predicted). Liberal theories of democratic process and of the capitalist marketplace are suspicious of secrecy as they are of cabals and monopolies. Democracy requires an informed citizenry, and the market is supposed to depend on the free flow of information. From this perspective, secrecy generally goes against the public good. Structural-functionalist analysis, however, argued that secret societies commonly serve important social functions, including the education of children, preservation of political authority, punishment of lawbreakers, stimulation of the economy, and the like (Little 1949). Such ‘social’ secret societies address important community needs, unlike ‘anti-social’ secret societies whose goals are criminal or revolutionary (Wedgewood 1930).
2.2 Secrecy and Power
In the latter years of the twentieth century, anthropological attention returned to issues of power and inequality. Neutral definitions of culture as more-or-less shared knowledge gave way to new concerns with the contradictions, gaps, variation, and disparity in that knowledge. Secret knowledge and secret societies might have social functions, but they also maintain ruling political and economic regimes. The distribution of public secrets typically parallels that of other rights and powers. Adults hide information from children, often until the latter have been ritually initiated into adulthood. Men refuse to share ritual knowledge with women. Family and lineage groups own various sorts of genealogical, medical, or historical knowledge, sharing these only with kin. Anthropologists have used a political economy of secrets to account for various forms of inequality. For example, the power of Kpelle elders over youth in Liberia was grounded in their management of the secrets of the Poro society (Murphy 1980). Beyond West Africa, anthropologists have argued that older, male keepers of secrets thereby acquired authority over women and the other uninitiated in various societies of Melanesia (Barth 1975), Native America (particularly Southwestern Pueblo cultures), and Australia (Keen 1994). Male appropriation of religious ritual, of access to the supernatural, of technology, of medicine, of sacred texts, of history, or of other valued information cemented men's authority over the ignorant.
Furthermore, women, children, and others in subordinate political position may be forced to reveal what they know, or otherwise denied rights to secrecy. Rights to have secrets are as consequential as the right to reveal information. Conversely, secrets can be a device to resist power. The dominated conceal what they know from authority. Women share knowledge kept from men. Slaves commune in secret languages. Children construct hidden worlds that evade adult awareness. In this reading, secrecy is a weapon of the weak that functions to resist and deflect supervision from above. Secrecy can also preserve a person's idiosyncratic beliefs and practices from public deprecation as in the case, for example, of middle-class English witches (Luhrmann 1989). Alongside preservation of regimes of power, secrecy also contributes to perceptions of individual identity. Self-understanding may transform after a person has acquired secret information. Boys, for example, come to redefine themselves as men after progressing through an initiation ritual during which they learn adult secrets (Herdt 1990).
2.3 Rights to Reveal Secrets
Many analysts of secrecy systems have concluded that many secrets are not actually secret. Women know men's tricks: those spirit voices overheard during ritual are really bamboo flutes or bullroarers. Children pretend not to know that their parents are really Santa Claus. Kin from one lineage are familiar with the supposedly secret names and genealogies of their neighbors. In oral societies, the leakiness of secret knowledge, in fact, helps maintain its viability within a community. If a secret holder dies, others are around to reveal lost knowledge if need be. Systems of secrecy often rely upon inequalities in rights to reveal knowledge rather than on an effective concealment of information—rights that the Kpelle summarize in the term 'you cannot talk it' (Bellman 1984). Folk copyrights of this sort regulate who may speak about what (Lindstrom 1990). Even though someone may know the content of concealed information, that person may not repeat this knowledge in public without the copyright to do so. Family groups in Vanuatu, for example, own songs, myths, genealogies, and medical recipes that others may not publicly reveal without that family's permission. In private contexts, however, public secrets are surreptitiously discussed. Knowledge is regulated not just by restricted transmission—by secrecy—but also by copyrights that limit who speaks publicly. Folk systems of copyright transform secrets into 'open' secrets. Those supposed not to know must pretend not to know. And those supposed to know pretend to think that only they, in fact, do know. Anthropological analyses of the social and psychological dynamics of open secrecy prefigured
work on other similar institutions, including the homosexual ‘closet’ (Sedgwick 1990, see also Taussig 1999 on public secrecy).
3. Personal Secrecy
Secrecy becomes possible given particular assumptions about personhood. The person must comprise some sort of inner self where secrets are stored along with an intention and capacity to conceal information. One can imagine a culture whose notions of the person, lacking such an inner self, would deny the possibility of secrecy. No such culture exists, although Western conceptions of childhood have often presumed that psychologically immature children lack competence with secrecy—that aptitudes to intentionally conceal information emerge as part of the child developmental process (Bok 1982). Western historians, too, have suggested that there have been different regimes of secrecy in the past, related to shifts in constructions of personhood. Simmel (1950) connected the evolution of secrecy to that of individualization and urbane modernity (his argument recalls similar evolutionary accounts of personal privacy). In pre-urban societies, lack of individualism and everyday intimacies of contact made secrecy difficult. Simmel supposed that secrets increased with developing opportunities for personal reserve and discretion. More recent historians, stimulated by the work of Michel Foucault (1978), have argued instead that modernity shrinks opportunities for personal secrecy—that bureaucratic power structures increasingly have come to rely upon the monitoring of individuals, either by themselves or by institutional devices that extract information. According to Foucault, the origins of such extraction trace back to the Christian practice of confession. People are obliged to reveal their secrets to ecclesiastical, juridical, and other authorities. Revelation to authority confirms one's subjugation within a social order. The modern individual possesses inner capacities to conceal information but also contradictory urges and duties to reveal those secrets. Stimulated by the work of Erving Goffman (1959) and others, students of interpersonal relationships have remarked on the significance of secrets, masks, and 'face' in an everyday micropolitics of self-presentation. Such studies have noted more egalitarian uses of revelation. People strengthen their relationships, creating communities of trust, by revealing secret information. This may be a secret about themselves, or a secret about another that they pass along, often in return for the promise 'not to tell.' Such secrets are a kind of gossip (Bok 1982), the exchange of which marks and maintains sentiments of friendship. Personal secrets are a social currency that people invest in
their relationships. Whereas public secrets maintain political value insofar as their transmission is restricted, the value of personal secrets flows from their easy everyday revelation.
4. The Study of Secrecy
Anthropology's interest in cross-cultural description as a whole can be taken to be the desire to learn and reveal other people's secrets (Taussig 1999). Methodologically, ethnographers face obvious problems of access to concealed information, but the study of secrecy raises even larger ethical puzzles. Anthropological codes of ethics generally require ethnographers to ensure that research does not harm the safety, dignity, or privacy of the people they study. Some researchers have described the structure and function of secrecy systems while avoiding details of secret knowledge content. Others have promised not to make their publications available to those members of a community (women, often) who should not have access to secret information. A few have refrained from publishing at all and restrict access to their fieldnotes. Ethical issues are thorniest where the secrets that anthropologists probe help maintain unjust social orders.

See also: Censorship and Secrecy: Legal Perspectives; Emotions, Sociology of; Guilt; Interpersonal Trust across the Lifespan; Knowledge, Anthropology of; Ritual; Trust: Philosophical Aspects; Trust, Sociology of
Bibliography
Barth F 1975 Ritual and Knowledge Among the Baktaman of New Guinea. Yale University Press, New Haven, CT
Bellman B L 1984 The Language of Secrecy: Symbols & Metaphors in Poro Ritual. Rutgers University Press, New Brunswick, NJ
Bok S 1982 Secrets: On the Ethics of Concealment and Revelation, 1st edn. Pantheon Books, New York
Foucault M 1978 The History of Sexuality Volume 1: An Introduction. Pantheon Books, New York
Goffman E 1959 The Presentation of Self in Everyday Life. Doubleday, Garden City, NY
Herdt G 1990 Secret societies and secret collectives. Oceania 60: 361–81
Keen I 1994 Knowledge and Secrecy in an Aboriginal Religion. Oxford University Press, Oxford, UK
Lindstrom L 1990 Knowledge and Power in a South Pacific Society. Smithsonian Institution Press, Washington, DC
Little K L 1949 The role of the secret society in cultural specialization. American Anthropologist 51: 199–212
Luhrmann T M 1989 The magic of secrecy. Ethos 17: 131–65
Murphy W P 1980 Secret knowledge as property and power in Kpelle society: Elders versus youth. Africa 50: 193–207
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Simmel G 1950 [1908] The secret and the secret society. In: Wolff K (ed.) The Sociology of Georg Simmel. Free Press, Glencoe, IL
Taussig M 1999 Defacement: Public Secrecy and the Labor of the Negative. Stanford University Press, Stanford, CA
Tefft S K 1980 Secrecy: A Cross-cultural Perspective. Human Sciences Press, New York
Webster H 1908 Primitive Secret Societies: A Study in Early Politics and Religion, 2nd rev. edn. Macmillan, New York
Wedgewood C H 1930 The nature and function of secret societies. Oceania 1: 129–45
L. Lindstrom
Secular Religions

The term 'secular religions' can be used to describe certain apparently secular enterprises that appear to share common features with enterprises usually thought of as 'religious.' 'Secular religion' is one of several terms—including 'civil religion,' 'invisible religion,' 'para-religion,' and 'quasi-religion'—which draw attention to religious and religious-like beliefs and activities which do not fit easily into the Western folk conception of religion as a distinct institutional structure focused on a transcendent being or beings.
1. Examples of 'Secular Religions'
At first blush, the very notion of 'secular religion' would appear to be an oxymoron. In both popular and sociological usage, the term 'religion' is typically associated with the realm of the 'sacred' and contrasted with that of the 'secular.' However, a number of scholars have pointed out striking similarities between ostensibly religious and ostensibly secular undertakings. Secular ideologies, it is argued, may, like religions, unite followers into a community of shared beliefs. They may provide adherents with a sense of meaning and ultimate purpose. They may inspire in believers a sense of transcendence usually associated with religion. Attempts to find parallels between the sacred and the secular have been especially common in studies of the political realm and in studies of therapeutic organizations and activities.
1.1 Political 'Religions'
Numerous scholars have highlighted the religious aspects of political movements and ideologies. Communism, for example, has often been regarded as a secular religion. Zuo (1991) has recently described the veneration of Chairman Mao during the Chinese Cultural Revolution as a political religion replete with
sacred beings (Mao himself), sacred texts (the Little Red Book), and ritual (political denunciations). The term 'political religion' has also been employed to describe attempts made in developing societies to rally support for the concept of the nation. Crippin (1988) has gone so far as to argue that nationalism is the religion par excellence in modern society and that it is displacing more traditional forms of religion. O'Toole (1977) has used the term 'sect' to describe certain political groups operating in Canada, including the Socialist Labor Party, followers of De Leon who wait for a Communist millennium which they regard as imminent. Such social movements as environmentalism, the animal rights movement, and the health food movement have been described as quasi-religions insofar as they provide adherents with a coherent worldview and sense of purpose at the same time that they command a great deal of loyalty.
1.2 Therapeutic 'Religions'
Many scholars have drawn attention to ritual aspects of western medical practice. Others have pointed out that psychiatrists have much in common with shamans and other religious healers. A number of researchers have pointed out the similarities that exist between self-help groups and religious organizations (Galanter 1989, Rudy and Greil 1988). One family within the self-help movement that is perhaps more obviously 'religious' in character than most of the others includes Alcoholics Anonymous (AA) and other 12-step or codependency groups. A few among many of the religious characteristics of AA are a conception of the sacred, ceremonies and rituals, creedal statements, experiences of transcendence, and the presence of an AA philosophy of life.
1.3 Other Examples
Other examples of attempts to point out analogies between apparently secular enterprises and religious ones abound in the social scientific literature. Within the realm of sport, attention has been paid to ritual elements and experiences of transcendence to be found in cricket, baseball, football, and the Olympic games. In the sphere of business, some writers have drawn attention to the sectarian characteristics of certain types of business, such as home-party sales organizations and direct sales organizations. It has now become commonplace among students of corporate culture to regard such ordinary activities as meetings, presentations, and retirement dinners as rituals. Among the voluntary organizations that have been interpreted as quasi-religions are the Boy Scouts, groups of hobbyists and collectors, and 'fan clubs.' Jindra (1994) has described the phenomenon of Star Trek 'fandom'
as religious in that it provides participants with a common worldview and inspires high levels of commitment. Several writers have attempted to characterize atheism itself as a religious enterprise, treating Ethical Culture and other humanist groups as analogous to religions. Still others have viewed faith in science as the dominant religious perspective of the contemporary era.
2. 'Secular Religions' and the Problem of Defining Religion
It may be argued that an interest in secular religion is a natural outgrowth of the functionalist perspective in classical sociology, which tended to view religion as serving a necessary function for the maintenance of society by uniting members of a society into a common moral universe. If, as many early sociologists thought, supernaturalistic conceptions of the universe were destined to recede in the face of industrialization and the increasing influence of science, the question arose as to what phenomena might serve as the 'social cement' of future societies. Comte's call for a 'religion of humanity' qualifies as an early sociological mention of the notion of a 'secular religion.' Durkheim's expectation that a 'cult of the individual' might play a similar role in the maintenance of society as that traditionally played by religions represents another early effort to provide intellectual justification for the idea that apparently secular ideologies may have religious characteristics. As appealing and intuitive as the idea that secular enterprises may share important features with religions might be, the notion of a 'secular religion' must confront theoretical problems concerning the appropriate sociological definition of 'religion' and of 'the sacred.' Functional definitions of religion emphasize that the essential element in religion is the provision of an encompassing system of meaning or the ability to connect people to the ultimate conditions of their existence. Substantive definitions of religion argue that what distinguishes religion from other types of human activity is its reference to the sacred, the supernatural, or the superempirical. The advantage of functional definitions is a breadth and inclusiveness that allows one to look at beliefs and practices not commonly referred to as religious but which may nonetheless resemble religious phenomena in important ways. One major disadvantage is that they may have the effect of so broadening the concept of religion that it becomes meaningless. While substantive definitions avoid this problem, they may result in allowing traditional Western conceptions of the nature of religion to determine what is to be included in the 'religious' category. Viewed from the perspective of the debate over the definition of religion, the theoretical problem with the concept of 'secular
religion' is that it seems to rely simultaneously on both a broad functional approach to religion and on a narrower substantive approach. The idea that environmentalism or nationalism could be properly called a religion relies on a functional definition, while the idea that such religions are 'secular' rather than 'sacred' necessarily depends on a substantive definition. One might argue that both functional and substantive definitions of religion share the weakness that they privilege social scientists' conceptions of religion over folk conceptions of religion, that is to say the ways that people use the term 'religion' in everyday life. Some social scientists would therefore argue that the search for a scientifically valid definition of religion is futile and that the best that can be done is to define religion ethnographically, treating it as a 'category of discourse' with meanings that are changeable over time. The position that scholars take with regard to these definitional issues will necessarily influence the way in which they approach the study of 'secular religions.' This is perhaps one reason why there is at present no consensus within the social sciences with regard to the most appropriate method for studying phenomena residing on the border between the sacred and the secular.
3. Approaches to the Study of 'Secular Religions'
Within contemporary sociology and anthropology, there exist several different research traditions that focus on the boundary between the religious and the non-religious.

3.1 The Notion of Civil Religion
While Rousseau coined the term 'civil religion,' its development as a social scientific concept is attributed to Robert Bellah (1967). Working within the functionalist tradition, which sees religion as integrative for society, Bellah argued for the existence of a US civil religion, an 'institutionalized collection of sacred beliefs about the American nation,' which binds US citizens together in spite of denominational pluralism. A key tenet of US civil religion is the conception of the USA as a nation with a divinely ordained mission. Although the civil religion concept was developed in the US context, it has been applied to the analysis of many states including Canada, Israel, and Malaysia (see Christianity: Evangelical, Revivalist, and Pentecostal).

3.2 The Implicit Religion Tradition
Although the term 'implicit religion' was coined and popularized by Edward Bailey (1983), the concept owes much to the work of Thomas Luckmann (1967).
Working with a broad functional definition of religion as the transcending of biological nature and the formation of a self, Luckmann argues that religion is a human universal. Luckmann maintains that traditional religions have become irrelevant in the modern world but that religion, rather than disappearing, has become personalized, privatized, and 'invisible' (see Religiosity: Modern). Following Luckmann's lead, Bailey's implicit religion approach looks for the experience of the sacred within events of everyday life ordinarily dismissed as profane. Thus, in a study of interaction in an English public house, Bailey interprets the ethos of 'being a man,' mastering alcohol and respecting the selves of others as implicitly religious.

3.3 The Study of 'Religious' Forms
There also exist a relatively large number of studies of general social 'forms' which are deemed to be relevant in both sacred and secular contexts. General discussions of 'secular ritual,' for example, fall into this category. Goffman's (1967) work on the functional significance of such rituals of everyday life as 'saving face' and showing 'deference' is particularly well known. Collins (1981) treats 'interaction ritual chains' as a key element in his attempt to lay a micro-sociological foundation for macro-sociology. Once conversations, social greetings, and athletic contests are allowed to count as ritual, ritual becomes virtually coterminous with social life. For this reason, some scholars reject 'ritual' as a meaningless category, while others argue that the ubiquity of ritual simply means that ritualizing is a fundamental human activity. Many sociological studies of the commitment process either explicitly or implicitly present the commitment process as operating in much the same way in both sacred and secular contexts (Kanter 1972). A number of writers have developed models of the identity change process which highlight the similarity between religious conversion and other forms of identity change (Galanter 1989, Greil and Rudy 1983). Studies that employ the term 'sect' in the analysis of apparently secular organizations have already been discussed.

3.4 The 'Quasi-religion' Approach
The quasi-religion approach popularized by Greil (1993) and his colleagues relies on an ethnographic approach to the term 'religion,' regarding it, not as an objective category susceptible to social scientific definition, but as a claim negotiated in the course of social interaction. Greil distinguishes between para-religions and quasi-religions. Para-religions, which are ostensibly nonreligious entities that nonetheless deal with matters of ultimate concern, resemble the enterprises referred to in this article as 'secular religions.'
Secular Religions The term ‘quasi-religion,’ on the other hand, refers to groups, like AA or certain occult groups, which deal with matters of transcendence or ultimate concern, but which do not see themselves or present themselves unambiguously as religious. The concern here is not to determine whether a particular group is or is not a quasi-religion but to examine the process by which particular groups attempt to claim or repudiate the religious label and by which other groups and individuals respond to these claims.
4. The Theoretical Significance of Secular Religions
Pursuing the analogy between ostensibly secular enterprises and 'religion' as it is usually conceived raises important questions concerning the proper definition of religion, the process of secularization, and the nature of religion in contemporary societies. The study of secular religions puts into bold relief the problematic nature of social scientific attempts to define religion. How one defines religion has important consequences for how one thinks about secular religions and for whether or not one regards the concept as useful. Attention to religious border phenomena presses one to consider the value of an ethnographic approach to definition that conceptualizes religion as a category of discourse whose precise meaning and implications are continually being negotiated in the course of social interaction. The study of secular religions also directs attention to difficulties in specifying and evaluating the secularization thesis, which claims that the influence of religion is declining in contemporary societies (see Secularization). It should be clear that one's approach to the question of secularization (including whether one even conceives of secularization as a theoretical possibility) depends on one's definition of religion and on one's treatment of religious border phenomena. Several of the approaches to secular religion discussed here imply that the traditional Western conception of religion as an institutionalized set of beliefs and practices focusing on a transcendent deity is beginning to lose its hold over many people. Likewise, many people seem increasingly to give expression to their experiences of transcendence in institutional contexts that have not typically been thought of as religious. If this is, in fact, the case, the line between religion and nonreligion may be expected to become even more blurred and the study of religious border phenomena even more central to the social scientific enterprise.

See also: Communism; Laicization, History of; Nationalism, Sociology of; New Religious Movements; Religion, Sociology of; Religiosity, Sociology of
Bibliography
Bailey E 1983 The implicit religion of contemporary society: An orientation and plea for its study. Religion 13: 69–83
Bellah R N 1967 Civil religion in America. Daedalus 96: 1–21
Collins R 1981 On the micro-foundations of macro-sociology. American Journal of Sociology 86: 984–1014
Crippin T 1988 Old and new gods in the modern world: Toward a theory of religious transformation. Social Forces 67: 316–36
Galanter M 1989 Cults: Faith, Healing, and Coercion. Oxford University Press, New York
Goffman E 1967 Interaction Ritual: Essays on Face-to-Face Behavior, 1st edn. Anchor Books, Garden City, NY
Greil A L 1993 Explorations along the sacred frontier: Notes on para-religions, quasi-religions, and other boundary phenomena. In: Bromley D, Hadden J K (eds.) Handbook of Cults and Sects in America: Assessing Two Decades of Research and Theory Development (Volume 2 of Religion and the Social Order). JAI Press, Greenwich, CT, pp. 153–72
Greil A L, Rudy D R 1983 Conversion to the world view of Alcoholics Anonymous: A refinement of conversion theory. Qualitative Sociology 6: 5–28
Jindra M 1994 Star Trek fandom as a religious phenomenon. Sociology of Religion 55: 27–51
Kanter R M 1972 Commitment and Community: Communes and Utopias in Sociological Perspective. Harvard University Press, Cambridge, MA
Luckmann T 1967 The Invisible Religion: The Problem of Religion in Modern Society. Macmillan, New York
O'Toole R 1977 The Precipitous Path: Studies in Political Sects. Peter Martin, Toronto, Canada
Rudy D R, Greil A L 1988 Is Alcoholics Anonymous a religious organization? Sociological Analysis 50: 41–51
Zuo J P 1991 Political religion: The case of the cultural revolution in China. Sociological Analysis 52: 99–110
A. L. Greil
Secularization In its precise historical sense, ‘secularization’ refers to the transfer of persons, things, meanings, etc., from ecclesiastical or religious to civil or lay use. In its broadest sense, often postulated as a universal developmental process, secularization refers to the progressive decline of religious beliefs, practices, and institutions.
1. The Term ‘Secularization’ Etymologically, the term secularization derives from the medieval Latin word saeculum, with its dual temporal-spatial connotation of secular age and secular world. Such a semantic connotation points to the fact that social reality in medieval Christendom was structured through a system of classification which
divided ‘this world’ into two heterogeneous realms or spheres, ‘the religious’ and ‘the secular.’ This was a particular, and historically rather unique, variant of the kind of universal dualist system of classification of social reality into sacred and profane realms, postulated by Émile Durkheim. In fact, Western European Christendom was structured through a double dualist system of classification. There was, on the one hand, the dualism between ‘this world’ (the City of Man) and ‘the other world’ (the City of God). There was, on the other hand, the dualism within ‘this world’ between a ‘religious’ and a ‘secular’ sphere. Both dualisms were mediated, moreover, by the sacramental nature of the church, situated in the middle, simultaneously belonging to both worlds and, therefore, able to mediate sacramentally between the two. The differentiation between the cloistered regular clergy and the secular clergy living in the world was one of the many manifestations of this dualism. The term secularization was first used in canon law to refer to the process whereby a religious monk left the cloister to return to the world and thus become a secular priest. In reference to an actual historical process, however, the term secularization was first used to signify the lay expropriation of monasteries, landholdings, and the mortmain wealth of the church after the Protestant Reformation. Thereafter, secularization has come to designate any transfer from religious or ecclesiastical to civil or lay use.
1.1 Secularization as a Historical Process Secularization refers to the historical process whereby the dualist system within ‘this world’ and the sacramental structures of mediation between this world and the other world progressively break down until the medieval system of classification disappears. Max Weber’s expressive image of the breaking of the monastery walls remains perhaps the best graphic expression of this radical spatial restructuration initiated by the Protestant Reformation. This process, which Weber conceptualized as a general reorientation of religion from an other-worldly to an inner-worldly direction, is literally a process of secularization. Religious ‘callings’ are redirected to the saeculum. Salvation and religious perfection are no longer to be found in withdrawal from the world but in the midst of worldly secular activities. In fact, the symbolic wall separating the religious and the secular realms breaks down. The separation between ‘this world’ and ‘the other world,’ for the time being at least, remains. But, from now on, there will be only one single ‘this world,’ the secular one, within which religion will have to find its own place. To study what new systems of classification and differentiation emerge within this one secular world and what new place religion will have within it
is precisely the analytical task of the theory of secularization. Obviously, such a concept of secularization refers to a particular historical process of transformation of Western European Christian societies and might not be directly applicable to other non-Christian societies with very different modes of structuration of the sacred and profane realms. It could hardly be applicable, for instance, to such ‘religions’ as Confucianism or Taoism, insofar as they are not characterized by high tension with ‘the world’ and have no ecclesiastical organization. In a sense those religions which have always been ‘worldly’ and ‘lay’ do not need to undergo a process of secularization. In itself such a spatial-structural concept of secularization describes only changes in the location of Christian religion from medieval to modern societies. It tells very little, however, about the extent and character of the religious beliefs, practices, and experiences of individuals and groups living in such societies. Yet the theory of secularization adopted by the modern social sciences incorporated the beliefs in progress and the critiques of religion of the Enlightenment and of Positivism and assumed that the historical process of secularization entailed the progressive decline of religion in the modern world. Thus, the theory of secularization became embedded in a philosophy of history that saw history as the progressive evolution of humanity from superstition to reason, from belief to unbelief, from religion to science.
2. The Secularization Paradigm

Secularization might have been the only theory within the social sciences that was able to attain the status of a paradigm. In one form or another, with the possible exception of Alexis de Tocqueville, Vilfredo Pareto, and William James, the thesis of secularization was shared by all the founders. Paradoxically, the consensus was such that for over a century the theory of secularization remained not only uncontested but also untested. Even Durkheim’s and Weber’s work, while serving as the foundation for later theories of secularization, offers scant empirical analysis of modern processes of secularization, particularly of the way in which those processes affect the place, nature and role of religion in the modern world. Even after freeing themselves from some of the rationalist and positivist prejudices about religion, they still shared the major intellectual assumptions of the age about the future of religion. For Durkheim, the old gods were growing old or already dead and the dysfunctional historical religions would not be able to compete with the new functional gods and secular moralities which modern societies were bound to generate. For Weber, the process of
intellectual rationalization, carried to its culmination by modern science, had ended in the complete disenchantment of the world, while the functional differentiation of the secular spheres had displaced the old integrative monotheism, replacing it with the modern polytheism of values and their unceasing and irreconcilable struggle. The old churches remain only as a refuge for those ‘who cannot bear the fate of the times like a man’ and are willing to make the inevitable ‘intellectual sacrifice.’ Only in the 1960s does one find the first attempts to develop more systematic and empirically grounded formulations of the theory of secularization, in the works of Acquaviva (1961), Berger (1967), Luckmann (1963), and Wilson (1966). It was then, at the very moment when theologians were celebrating the death of God and the secular city, that the first flaws in the theory became noticeable and the first systematic critiques were raised by Martin (1969) and Greeley (1972) in what constituted the first secularization debate. For the first time it became possible to separate the theory of secularization from its ideological origins in the Enlightenment critique of religion and to distinguish the theory of secularization, as a theory of the modern autonomous differentiation of the secular and the religious spheres, from the thesis that the end result of the process of modern differentiation would be the progressive erosion, decline and eventual disappearance of religion. Greeley (1972) already pointed out that the secularization of society, which he conceded, by no means implied the end of church religiosity, the emergence of ‘secular man,’ or the social irrelevance of religion in modern secular societies. Yet after three decades the secularization debate remains unabated. Defenders of the theory tend to point to the secularization of society and to the decline of church religiosity in Europe as substantiating evidence, while critics tend to emphasize the persistent religiosity in the United States and widespread signs of religious revival as damaging counterevidence that justify discarding the whole theory as a ‘myth.’
3. The Three Subtheses of the Theory of Secularization

Although it is often viewed as a single unified theory, the paradigm of secularization is actually made up of three different and disparate propositions: secularization as differentiation of the secular spheres from religious institutions and norms, secularization as general decline of religious beliefs and practices, and secularization as privatization or marginalization of religion to a privatized sphere. Strictly speaking, the core and central thesis of the theory of secularization is the conceptualization of the historical process of societal modernization as a process of functional differentiation and emancipation of the secular
spheres—primarily the state, the economy, and science—from religion and the concomitant specialized and functional differentiation of religion within its own newly found religious sphere. The other subtheses, the decline and privatization of religion, were added as allegedly necessary structural consequences of the process of secularization. Maintaining this analytical distinction should allow the examination and testing of the validity of each of the three propositions independently of each other and thus refocus the often fruitless secularization debate into comparative historical analysis that could account for obviously different patterns of secularization.
3.1 The Differentiation and Secularization of Society

The medieval dichotomous classification of reality into religious and secular realms was to a large extent dictated by the church. In this sense, the official perspective from which medieval societies saw themselves was a religious one. Everything within the saeculum remained an undifferentiated whole as long as it was viewed from the outside, from the perspective of the religious. The fall of the religious walls put an end to this dichotomous way of thinking and opened up an entire new space for processes of internal differentiation of the various secular spheres. Now, for the first time, the various secular spheres—politics, economics, law, science, art, etc.—could come fully into their own, become differentiated from each other, and follow what Weber called their ‘internal and lawful autonomy.’ The religious sphere, in turn, became a less central and spatially diminished sphere within the new secular system, but also a more internally differentiated one, specializing in ‘its own religious’ function and losing many other ‘nonreligious’ (clerical, educational, social welfare) functions (Luhmann 1977). The loss in functions entailed as well a significant loss in hegemony and power.

It is unnecessary to enter into the controversial search for first causes setting the modern process of differentiation into motion. It suffices to stress the role which four related and parallel developments played in undermining the medieval religious system of classification: the Protestant Reformation; the formation of modern states; the growth of modern capitalism; and the early modern scientific revolution. Each of the four developments contributed its own dynamic to modern processes of secularization. The four of them together were certainly more than sufficient to carry the process through.

The Protestant Reformation, by undermining the universalist claims of the Roman Catholic church, helped to destroy the old organic system of Western Christendom and to liberate the secular spheres from religious control. Protestantism also served to legitimate the rise of bourgeois man and of the new entrepreneurial classes, the rise of the modern sovereign state against the universal Christian monarchy, and the triumph of the new science against Catholic scholasticism. Moreover, Protestantism may also be viewed as a form of internal secularization, as the vehicle through which Christian religious contents were to assume institutionalized secular forms in modern societies, thereby erasing the religious/secular divide.

If the universalist claims of the church as a salvation organization were undermined by the religious pluralism introduced by the Reformation, its monopolist compulsory character was undermined by the rise of a modern secular state which progressively was able to concentrate and monopolize the means of violence and coercion within its territory. In the early absolutist phase the alliance of throne and altar became even more accentuated. The churches attempted to reproduce the model of Christendom at the national level, but all the territorial national churches, Anglican, Lutheran, Catholic, and Orthodox, fell under the caesaro–papist control of the absolutist state. As the political costs of enforcing conformity became too high, the principle cuius regio eius religio turned into the principle of religious tolerance and state neutrality towards privatized religion, the liberal state’s preferred form of religion. Eventually, new secular raison d’état principles led to the constitutional separation of church and state, even though some countries such as England and the Scandinavian Lutheran countries have maintained formal establishment.

Capitalism, that revolutionizing force in history which according to Marx ‘melts all that is solid into air and profanes all that is holy,’ had already sprouted within the womb of the old Christian society in the medieval towns. No other sphere of the saeculum would prove more secular and more unsusceptible to religious and moral regulation than the capitalist market. Nowhere is the transvaluation of values which takes place from Medieval to Puritan Christianity as radical and as evident as in the change of attitude towards ‘charity’—that most Christian of virtues—and towards poverty. Following Weber (1958), one could distinguish three phases and meanings of capitalist secularization: in the Puritan phase, ‘asceticism was carried out of monastic cells into everyday life’ and secular economic activities acquired the meaning and compulsion of a religious calling; in the utilitarian phase, as the religious roots dried out, the irrational compulsion turned into ‘sober economic virtue’ and ‘utilitarian worldliness’; finally, once capitalism ‘rests on mechanical foundations,’ it no longer needs religious or moral support and begins to penetrate and colonize the religious sphere itself, subjecting it to the logic of commodification (Berger 1967).

The conflict between the church and the new science, symbolized by the trial of Galileo, was not as much about the substantive truth or falsity of the new
Copernican theory as it was about the validity of the claims of the new science to have discovered a new autonomous method of obtaining and verifying truth. The conflict was above all about science’s claims to differentiated autonomy from religion. The Newtonian Enlightenment established a new synthesis between faith and reason, which in Anglo-Saxon countries was to last until the Darwinian crisis of the second half of the nineteenth century. Across the Channel, however, the Enlightenment became patently radicalized and militantly anti-religious. Science was transformed into a scientific and scientistic worldview which claimed to have replaced religion the way a new scientific paradigm replaces an outmoded one.

As each of these carriers—Protestantism, the modern state, modern capitalism, and modern science—developed different dynamics in different places and at different times, the patterns and the outcomes of the historical process of secularization varied accordingly. Yet it is striking how few comparative historical studies of secularization there are which would take these four, or other, variables into account. David Martin’s A General Theory of Secularization is perhaps the single prominent exception. Only when it comes to capitalism has it been nearly universally recognized that economic development affects the ‘rates of secularization,’ that is, the extent and relative distribution of religious beliefs and practices. This positive insight, however, turns into a blinder when it is made into the sole variable accounting for different rates of secularization. As a result, those cases in which the expected positive correlation between rates of secularization and indicators of socio-economic development (industrialization, urbanization, proletarianization, education) is not found are classified as ‘exceptions’ to the ‘rule.’

3.2 The Decline of Religion Thesis

The assumption, sometimes stated but more often unstated, that religion in the modern world was declining and would likely continue to decline has been until recently a dominant postulate of the theory of secularization. It was based primarily on evidence from European societies showing that the closer people were involved in industrial production, the less religious they became or, at least, the less they took part in institutional church religiosity. The theory assumed that the European trends were universal and that non-European societies would evince similar rates of religious decline with increasing industrialization. It is this part of the theory which has proven patently wrong.

One should keep in mind the inherent difficulties in comparative studies of religion. Globally, the evidence is insufficient and very uneven. There is often no consensus as to what counts as religion, and even when there is agreement on the object of study, there is likely to be disagreement on which of the dimensions of religiosity (membership affiliation, official vs. popular religion, beliefs, ritual and nonritual practices, experiences, doctrinal knowledge, and their behavioral and ethical effects) one should measure and how the various dimensions should be ranked and compared. Nevertheless, one can prudently state that since World War II, despite rapid increases in industrialization, urbanization, and education, most religious traditions in most parts of the world have either experienced some growth or maintained their vitality (Whaling 1987). The main exceptions were: the rapid decline of primal religions, mostly in exchange for more ‘advanced’ ones (mainly Islam and Christianity); the sudden and dramatic decline of religion in communist countries, a process which is now being reversed after the fall of communism; and the continuous decline of religion throughout much of Western Europe (and, one could add, some of its colonial outposts such as Argentina, Uruguay, New Zealand, and Quebec).

What remains, therefore, as significant and overwhelming evidence is the progressive and apparently still continuing decline of religion in Western Europe. It is this evidence which has always served as the empirical basis for most theories of secularization. Were it not for the fact that religion shows no uniform sign of decline in Japan or the United States, two equally modern societies, one could still perhaps maintain the ‘modernizing’ developmentalist assumption that it is only a matter of time before the more ‘backward’ societies catch up with the more ‘modern’ ones. But such an assumption is no longer tenable. Leaving aside the evidence from Japan, a case which should be crucial, however, for any attempt to develop a general theory of secularization, there is still the need to explain the obviously contrasting religious trends in Western Europe and the United States.

Until very recently, most of the comparative observations as well as attempts at explanation came from the European side. European visitors have always been struck by the vitality of American religion. The United States appeared simultaneously as the land of ‘perfect disestablishment’ and as ‘the land of religiosity par excellence.’ Yet Europeans rarely felt compelled to put into question the thesis of the general decline of religion in view of the American counterevidence. Religious decline was so much taken for granted that what required an explanation was the American ‘deviation’ from the European ‘norm.’ The standard explanations have been either the expedient appeal to ‘American exceptionalism’ or the casuistic strategy of ruling out the American evidence as irrelevant, because American religion was supposed to have become so ‘secular’ that it should no longer count as religion (Luckmann 1963).

From a global–historical perspective, however, it is the dramatic decline of religion in Europe that truly demands an explanation. A plausible answer would require a search for independent variables, for those independent carriers of secularization present in Europe but absent in the United States. Looking at the four historical carriers mentioned above, neither Protestantism nor capitalism would appear as plausible candidates. The state and scientific culture, however, could serve as plausible independent variables, since church–state relations and the scientific worldviews carried by the Enlightenment were significantly different in Europe and America.

What the United States never had was an absolutist state and its ecclesiastical counterpart, a caesaro–papist state church. It was the caesaro–papist embrace of throne and altar under absolutism that perhaps more than anything else determined the decline of church religion in Europe. In most European countries, nonestablished churches and sects have consistently been able to withstand the secularizing trends better than the established church. It was the very attempt to preserve and prolong Christendom in every state and thus to resist modern functional differentiation that nearly destroyed the churches in Europe. The Enlightenment and its critique of religion became themselves independent carriers of processes of secularization wherever the established churches became obstacles to the modern process of functional differentiation or resisted the emancipation of the cognitive–scientific, political–practical, or aesthetic–expressive secular spheres from religious and ecclesiastical tutelage. In such cases, the Enlightenment critique of religion was usually adopted by social movements and political parties, becoming in the process a self-fulfilling prophecy. By contrast, wherever religion itself accepted, perhaps even furthered, the functional differentiation of the secular spheres from the religious sphere, the radical Enlightenment and its critique of religion became superfluous. Simply put, the more religions resist the process of modern differentiation, that is, secularization in the first sense, the more they will tend in the long run to suffer religious decline, that is, secularization in the second sense.

3.3 The Privatization of Religion Thesis

As a corollary of the thesis of differentiation, religious disestablishment entails the privatization of religion. Religion becomes indeed ‘a private affair.’ Insofar as freedom of conscience, ‘the first freedom’ as well as the precondition of all modern freedoms, is related intrinsically to ‘the right to privacy’—to the modern institutionalization of a private sphere free from governmental intrusion as well as free from ecclesiastical control—and inasmuch as the right to privacy serves as the very foundation of modern liberalism and of modern individualism, then indeed the privatization of religion is essential to modern societies. There is, however, another more radical version of the thesis of privatization which often appears as a corollary of the decline of religion thesis. In modern secular societies, whatever residual religion remains
becomes so subjective and privatized that it turns ‘invisible,’ that is, marginal and irrelevant from a societal point of view. Not only are traditional religious institutions becoming increasingly irrelevant but, Luckmann (1963) adds, modern religion itself is no longer to be found inside the churches. The modern quest for salvation and personal meaning has withdrawn to the private sphere of the self. Significant for the structure of modern secular societies is the fact that this quest for subjective meaning is a strictly personal affair. The primary ‘public’ institutions (state, economy) no longer need or are interested in maintaining a sacred cosmos or a public religious worldview.

The functionalist thesis of privatization turns problematic when, instead of being a falsifiable empirical theory of dominant historical trends, it becomes a prescriptive normative theory of how religious institutions ought to behave in the modern world. Unlike secular differentiation, which remains a structural trend that serves to define the very structure of modernity, the privatization of religion is a historical option, a ‘preferred option’ to be sure, but an option nonetheless. Privatization is preferred internally, as evinced by general pietistic trends, by processes of religious individuation, and by the reflexive nature of modern religion. Privatization is constrained externally by structural trends of differentiation which force religion into a circumscribed and differentiated religious sphere. Above all, privatization is mandated ideologically by liberal categories of thought which permeate modern political and constitutional theories. The theory of secularization should be free from such a liberal ideological bias and admit that there may be legitimate forms of ‘public’ religion in the modern world, which are not necessarily anti-modern fundamentalist reactions and which do not need to endanger either modern individual freedoms or modern differentiated structures. Many of the recent critiques and revisions of the paradigm of secularization derive from the fact that in the 1980s religions throughout the world thrust themselves unexpectedly into the public arena of moral and political contestation, demonstrating that religions in the modern secular world continue to have, and will likely continue to have, a public dimension.

See also: Religion: Evolution and Development; Religion, Phenomenology of; Religion, Sociology of; Secular Religions
Bibliography

Acquaviva S S 1961 L’eclissi del sacro nella civiltà industriale. Edizioni di Comunità, Milan [1979 The Decline of the Sacred in Industrial Society. Blackwell, Oxford]
Berger P 1967 The Sacred Canopy. Doubleday, Garden City, NY
Casanova J 1994 Public Religions in the Modern World. University of Chicago Press, Chicago
Dobbelaere K 1981 Secularization: A Multidimensional Concept. Sage, Beverly Hills, CA
Greeley A 1972 Unsecular Man: The Persistence of Religion. Schocken, New York
Greeley A 1989 Religious Change in America. Harvard University Press, Cambridge, MA
Hadden J K, Shupe A (eds.) 1989 Secularization and Fundamentalism Reconsidered. Paragon House, New York
Luckmann T 1963 Das Problem der Religion in der modernen Gesellschaft. Rombach, Freiburg [1967 Invisible Religion. Macmillan, New York]
Luhmann N 1977 Funktion der Religion. Suhrkamp Verlag, Frankfurt [1984 Religious Dogmatics and the Evolution of Societies. Mellen Press, New York]
Martin D 1969 The Religious and the Secular. Schocken, New York
Martin D 1978 A General Theory of Secularization. Blackwell, Oxford, UK
Stark R, Bainbridge W S 1985 The Future of Religion. University of California Press, Berkeley, CA
Weber M 1958 The Protestant Ethic and the Spirit of Capitalism. Scribner’s Sons, New York
Whaling F (ed.) 1987 Religion in Today’s World: The Religious Situation of the World from 1945 to the Present Day. T & T Clark, Edinburgh, UK
Wilson B 1966 Religion in Secular Society. Watts, London
Wilson B 1985 Secularization: The inherited model. In: Hammond P (ed.) The Sacred in a Secular Age. University of California Press, Berkeley, CA
J. Casanova
Segregation Indices

Measuring the residential segregation of racial and ethnic populations became very important to social scientists during the civil rights movement in the United States (Duncan and Duncan 1955; Taeuber and Taeuber 1965). Until the mid-1970s, when important critiques of the dissimilarity index stimulated efforts to develop alternatives, research relied almost exclusively upon this measure, and focused on the residential segregation of blacks from whites in the United States. Despite the many alternatives proposed, only the isolation index has really joined the dissimilarity index as a measure frequently used in empirical research on residential segregation. However, applications have extended substantially, to other racial and ethnic populations, and also to groups within and across races, defined by income, poverty, nativity, or other characteristics. Virtually all segregation indices are constructed by aggregating the observed distributions of the ‘minority’ and referent populations across a metropolitan area or city divided into small geographic areas—usually census tracts or census blocks in the United States. These geographic units are sufficiently small to be taken as rough approximations of neighborhoods whose residents would acknowledge and share some sense of common spatial identity, and hence some sense of community. Massey and Denton’s (1988) review and analysis of
20 indices that either had been or could be proposed for measuring distinct aspects of residential segregation represents the major watershed in the field. Massey and Denton suggested that the 20 indices measured five distinct dimensions of residential segregation:

(a) Evenness: indices that measure how the observed areal distributions of the minority and majority group populations in a metropolitan area deviate from distributions representing various definitions and criteria for equality or randomness.

(b) Exposure: indices based on the probability that the next person a minority group member encounters in their neighborhood or geographic area will also be a member of that group, or will instead be a member of the referent group.

(c) Concentration: indices that measure the degree to which a minority population is concentrated in a relatively small number of compact tracts.

(d) Centralization: indices that measure the degree to which a minority population resides close to the center or ‘downtown’ areas of a city.

(e) Clustering: indices that measure the degree to which the minority population disproportionately resides in neighborhoods or geographic units that adjoin one another, and hence form clusters or enclaves.
1. Measures of Evenness

The index of dissimilarity remains by far the most frequently used in empirical research on residential segregation. It is built on the premise that a metropolitan area or city would be integrated completely if all of its census tracts (or other areal units) had exactly the proportion of minority group and majority group residents as the metropolitan area or city as a whole. The index is calculated as the average deviation of the tracts or areal units from this proportionate representation, where the deviation for tract i is |p_i − P|. Here t_i is the total population of tract i, p_i the minority proportion of tract i, T the total population of the metropolitan area, and P its minority proportion. Thus

D = \sum_{i=1}^{n} t_i |p_i − P| / [2TP(1 − P)]

The resulting dissimilarity score can be interpreted as representing the percentage of the metropolitan area’s minority residents who would have to move from tracts or areal units where the minority group is over-represented to areal units where they are under-represented to achieve proportionate representation or complete integration. The primary criticism of the dissimilarity index is that it is insensitive to moves (or ‘transfers’) of minority group members among tracts where they are over-represented or under-represented. Several indices developed in literatures measuring income inequalities overcome this violation of the ‘transfers principle.’
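The calculation is easy to make concrete. The following is a minimal Python sketch (the tract counts are invented for illustration; the variable names mirror the notation above):

```python
import numpy as np

# Hypothetical tract data: t = total population, x = minority population per tract.
t = np.array([4000, 3500, 5000, 2500, 3000], dtype=float)
x = np.array([ 400, 2100,  250, 1500,  300], dtype=float)

T = t.sum()   # metropolitan total population
X = x.sum()   # metropolitan minority population
P = X / T     # metropolitan minority proportion
p = x / t     # tract minority proportions

# Index of dissimilarity: average tract deviation from proportionate representation.
D = np.sum(t * np.abs(p - P)) / (2 * T * P * (1 - P))
print(round(D, 3))
```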
The Gini coefficient averages the differences in the proportions of the minority group in every pair of tracts or areal units in the metropolitan area, where |p_i − p_j| represents the difference in these proportions for tracts i and j:

G = \sum_{i=1}^{n} \sum_{j=1}^{n} t_i t_j |p_i − p_j| / [2T^2 P(1 − P)]
The Atkinson measures allow researchers to develop indices that weight tracts where the minority group is under-represented (p_i < P) and over-represented (p_i > P) equally (b = 0.5), or to weight under-representation (0 < b < 0.5) or over-representation (0.5 < b < 1.0) more heavily:

A = 1 − [P/(1 − P)] | (1/PT) \sum_{i=1}^{n} (1 − p_i)^{1−b} p_i^{b} t_i |^{1/(1−b)}

The theoretical or conceptual value of the Atkinson indices lies in applications where one might be more interested in reducing under-representation (e.g., integrating ‘lily-white’ suburbs) or in reducing over-representation (e.g., desegregating all black or Hispanic ghettos or barrios).

Entropy or information indices are based on the concept that entropy is maximized and information minimized when distributions are in equilibrium. In segregation applications, this would occur when the proportions of the minority and majority group populations were equal (50 percent for two groups) in a metropolitan area. The entropy index measures the percentage of the metropolitan area’s entropy that is represented by the average deviation of entropy in each tract or areal unit from that in the metropolitan area as a whole. Thus, where entropy in each tract i and in the metropolitan area or city are represented, respectively, by

E_i = p_i log(1/p_i) + (1 − p_i) log(1/(1 − p_i))

and

E = P log(1/P) + (1 − P) log(1/(1 − P))

then

H = \sum_{i=1}^{n} t_i (E − E_i) / (ET)
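A sketch of the entropy index under the same invented tract data as above (the clipping of proportions guards against log(0) in homogeneous tracts and is a practical device, not part of the definition):

```python
import numpy as np

t = np.array([4000, 3500, 5000, 2500, 3000], dtype=float)
x = np.array([ 400, 2100,  250, 1500,  300], dtype=float)
T, X = t.sum(), x.sum()
p, P = x / t, X / T

def entropy(q):
    # Two-group entropy: q log(1/q) + (1 - q) log(1/(1 - q)).
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return q * np.log(1 / q) + (1 - q) * np.log(1 / (1 - q))

E_i = entropy(p)                      # tract-level entropies
E = entropy(P)                        # metropolitan entropy
H = np.sum(t * (E - E_i)) / (E * T)   # entropy (information) index
print(round(H, 3))
```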
2. Measures of Exposure

Scores on the entropy index depend on P, the proportion that the minority group represents of the metropolitan area or city population. This violates a criterion which proposes that segregation indices should have ‘compositional invariance,’ that is, be independent of the proportions of minority and majority group members in different cities or metropolitan areas. However, in many critical applications such as school desegregation, the very low percentages of minority or majority group students in a district constitute major barriers to increasing racial integration or balance. The conceptual appropriateness and theoretical value of indices that are influenced by the relative proportions of minority and majority group members is thus undeniable, and they are among the few measures besides the dissimilarity index that are used extensively in empirical research.

The interaction and isolation measures are designed specifically to measure segregation defined by the degree to which the relative sizes of the minority and majority populations, as well as their distributions across neighborhoods or areal units, affect the chances of their availability to interact with one another in those neighborhoods. The interaction index is the average, weighted by the proportion of the metropolitan area’s minority group living in tract i, of the majority group’s proportion in each tract or areal unit i. It can be interpreted as representing the probability that the next person that a minority resident of a neighborhood or tract encounters will belong to the majority population:

xP*y = \sum_{i=1}^{n} (x_i/X)(y_i/t_i)

Conversely, the isolation index measures the likelihood that minority residents in a tract encounter only each other, and is calculated as the minority-weighted average of the minority group’s proportion in each tract or areal unit i:

xP*x = \sum_{i=1}^{n} (x_i/X)(x_i/t_i)

Because the interaction and isolation indices depend on the composition of the population, xP*y and xP*x will be equal only if the majority and minority groups represent equal proportions of the population. The correlation ratio, Eta², adjusts for the effects of the minority proportion of the population, and indeed can be classified as an evenness measure:

V = Eta² = (xP*x − P)/(1 − P)

3. Measures of Concentration

Concentration indices seek to measure the extent to which the minority population occupies less of the metropolitan area’s or city’s geographic area than the majority or total population, and hence also lives at higher densities. If the minority population were uniformly distributed throughout the city or metropolitan area, the proportion of the total minority population in tract or areal unit i would be equal to that tract’s proportion of the total area in the metropolitan area (a_i/A). Delta measures concentration by aggregating the deviations from this expectation:

∆ = 0.5 \sum_{i=1}^{n} | (x_i/X) − (a_i/A) |

Like the index of dissimilarity, the score on delta can be interpreted as representing the percentage of the minority group population that would have to move from areas of higher to lower than average density for the group to live at uniform densities across the metropolitan area.

Massey and Denton (1988) proposed new measures of concentration. The absolute concentration index compares the observed distribution of the minority group in the metropolitan area against the theoretical minimum and maximum areas they could occupy. If tracts are ordered and the total population is cumulated from smallest to largest in area, the minimum area that could accommodate the minority group population is reached at tract n_1, where the cumulated population reaches or exceeds the total minority population (X) of the metropolitan area. Analogously, the maximum area that the minority population could occupy is obtained by cumulating the total population from the tracts largest in area to smallest, and finding the point n_2 where the cumulated population equals or exceeds the minority total. With T_1 denoting the total population of tracts 1 to n_1, and T_2 the total population of tracts n_2 to n:

ACO = 1 − [ ( \sum_{i=1}^{n} x_i a_i/X − \sum_{i=1}^{n_1} t_i a_i/T_1 ) / ( \sum_{i=n_2}^{n} t_i a_i/T_2 − \sum_{i=1}^{n_1} t_i a_i/T_1 ) ]

The relative concentration index compares the relative concentration observed for the minority and majority populations with the maximum ratio possible, which occurs if the minority population were concentrated in the smallest possible area, and the majority population spread across the largest:

RCO = [ ( \sum_{i=1}^{n} x_i a_i/X ) / ( \sum_{i=1}^{n} y_i a_i/Y ) − 1 ] / [ ( \sum_{i=1}^{n_1} t_i a_i/T_1 ) / ( \sum_{i=n_2}^{n} t_i a_i/T_2 ) − 1 ]

The segregation indices usually range from 0.0 to 1.0. In contrast, the relative concentration index ranges from −1.0 to 1.0, with negative values indicating that the minority population is less concentrated than the majority.
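The exposure measures reduce to simple weighted averages over tracts. A minimal sketch with invented counts (for a two-group city, xP*y and xP*x sum to 1):

```python
import numpy as np

# Hypothetical tract data: x = minority counts, y = majority counts.
x = np.array([ 400, 2100,  250, 1500,  300], dtype=float)
y = np.array([3600, 1400, 4750, 1000, 2700], dtype=float)
t = x + y
X, T = x.sum(), t.sum()
P = X / T

xPy = np.sum((x / X) * (y / t))   # interaction: P(next contact is majority)
xPx = np.sum((x / X) * (x / t))   # isolation: P(next contact is minority)
V = (xPx - P) / (1 - P)           # correlation ratio Eta^2
print(round(xPy, 3), round(xPx, 3), round(V, 3))
```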
4. Measures of Centralization

Because older and cheaper housing is usually closer to the center of the city, and because housing discrimination prevented blacks and other minorities in the United States from suburbanizing for decades after whites did, minority populations are often concentrated in neighborhoods closer to the center of the city. Both centralization measures order the metropolitan area’s tracts or areal units by increasing distance from the central business district. The absolute centralization index aggregates the deviations of the cumulative proportion of the minority (X_i) reached at each tract i from the cumulative proportion of the land area (A_i) reached by that tract:

ACE = \sum_{i=1}^{n} X_{i−1} A_i − \sum_{i=1}^{n} X_i A_{i−1}

The relative centralization index aggregates the deviations of the cumulative proportions of the minority (X_i) and majority (Y_i) populations reached at each tract i from what would be expected were they comparably distributed:

RCE = \sum_{i=1}^{n} X_{i−1} Y_i − \sum_{i=1}^{n} X_i Y_{i−1}

The ACE ranges from 0.0 to 1.0, but the RCE from −1.0 to 1.0, with negative values indicating that the minority population is less centralized than the majority.
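A sketch of the absolute centralization index, assuming the tract arrays have already been sorted by increasing distance from the central business district (the areas and counts are invented):

```python
import numpy as np

a = np.array([2.0, 3.0, 5.0, 8.0, 12.0])               # tract land areas
x = np.array([1500, 900, 600, 300, 100], dtype=float)  # minority counts

Xc = np.cumsum(x) / x.sum()   # cumulative minority proportions X_i
Ac = np.cumsum(a) / a.sum()   # cumulative area proportions A_i

# ACE = sum(X_{i-1} A_i) - sum(X_i A_{i-1}), with X_0 = A_0 = 0.
X_prev = np.concatenate(([0.0], Xc[:-1]))
A_prev = np.concatenate(([0.0], Ac[:-1]))
ACE = np.sum(X_prev * Ac) - np.sum(Xc * A_prev)
print(round(ACE, 3))   # positive here: the minority is centralized
```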
5. Measures of Clustering

The final dimension of residential segregation measures the degree to which tracts with disproportionately large minority populations adjoin one another, creating, for example, large ghettos or barrios, or are scattered throughout the metropolitan area. This is a form of the geographer’s contiguity and checkerboard problems, and drawing on this literature, Massey and Denton (1988) proposed a measure of absolute clustering:

ACL = [ \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} c_{ij} x_j − (X/n^2) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} ] / [ \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} c_{ij} t_j − (X/n^2) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} ]

where c_{ij} = exp(−d_{ij}), that is, the negative exponential of the distance d_{ij} between the centroids of tracts i and j (and where d_{ii} = (0.6 a_i)^{0.5}). The negative exponential is used to estimate the otherwise massive contiguity matrix whose elements equal 1.0 when tracts i and j are contiguous, and 0.0 when they are not. The formula takes the average number of minority residents in nearby tracts as a proportion of the total population in those tracts. The absolute clustering index has a minimum of 0.0 and can approach, but never reach, 1.0.

The spatial proximity index compares the average proximity of minority group residents to one another (P_xx), and of majority group residents to one another (P_yy), to the average proximity among the residents in the total population (P_tt), weighted by the fraction that each group represents in the population. Thus

SP = (X P_xx + Y P_yy) / (T P_tt)

where

P_xx = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_i x_j / X^2

and P_yy and P_tt are calculated by analogues of P_xx. Comparing the relative intragroup proximities of the minority and majority populations produces a measure of relative clustering:

RCL = (P_xx / P_yy) − 1

Clustering can also be measured by extending the concepts underlying the interaction and isolation measures to estimate how these probabilities should decay with distance. The distance-decay interaction and isolation indices can be constructed by aggregating, over each tract i, the probability that the next person one meets anywhere in the city is, respectively, a majority or minority resident from tract j, and that this probability exponentially decays with increasing distance between i and j. Thus

DP_xy = \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (k_{ij} y_j / t_j)

and

DP_xx = \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (k_{ij} x_j / t_j)

where

k_{ij} = t_j exp(−d_{ij}) / \sum_{j=1}^{n} t_j exp(−d_{ij})
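A sketch of the proximity-based measures under invented centroids and counts (the within-tract distance convention d_ii = (0.6 a_i)^0.5 follows the definition above):

```python
import numpy as np

# Hypothetical tract centroids (km), land areas, and group counts.
coords = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3]], dtype=float)
a = np.array([1.0, 1.2, 0.8, 2.0, 1.5])
x = np.array([1200, 900, 800, 150, 100], dtype=float)    # minority
y = np.array([ 300, 400, 500, 2000, 1800], dtype=float)  # majority
t = x + y
X, Y, T = x.sum(), y.sum(), t.sum()

d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
np.fill_diagonal(d, np.sqrt(0.6 * a))   # within-tract distance d_ii
c = np.exp(-d)                          # distance-decay contiguity approximation

Pxx = (x @ c @ x) / X**2                # average minority-minority proximity
Pyy = (y @ c @ y) / Y**2
Ptt = (t @ c @ t) / T**2
SP = (X * Pxx + Y * Pyy) / (T * Ptt)    # spatial proximity index
RCL = Pxx / Pyy - 1                     # relative clustering
print(round(SP, 3), round(RCL, 3))
```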
6. Conclusion Few empirical researchers have used more than one or two of these 20 indices in studying residential segregation, and the full spectrum remains of interest primarily to those devoted to constructing measures of the phenomenon in its varied dimensions. Massey and Denton (1988) used factor analysis to extract one index from each dimension and to define a concept of
hypersegregation for blacks in the United States based on their high segregation scores on each of the five dimensions. This has extended the use of measures beyond the dissimilarity and interaction/isolation indices. However, the literature is still far from developing an over-arching framework that might suggest how several measures might be used systematically to identify how urban areas vary in the five-dimensional space suggested, or with what consequences. One might imagine, for example, identifying and comparing urban areas where high unevenness results from low integration in the suburbs rather than large ghettos, or those with concentrated and perhaps centralized ghettos or barrios vs. others with comparably high unevenness or isolation but less clustered distributions of the minority population. The full set of indices has much potential value that remains to be developed.

See also: Ethnic Conflicts; Locational Conflict (NIMBY); Population Composition by Race and Ethnicity: North America; Race and Urban Planning; Racial Relations; Residential Concentration/Segregation, Demographic Effects of; Residential Segregation: Sociological Aspects; Urban Geography; Urban Sociology
Bibliography

Duncan O D, Duncan B 1955 A methodological analysis of segregation indices. American Sociological Review 20: 210–17
Massey D S, Denton N A 1988 The dimensions of residential segregation. Social Forces 67: 281–315
Taeuber K E, Taeuber A F 1965 Negroes in Cities: Residential Segregation and Neighborhood Change. Aldine, Chicago

R. Harrison

Selection Bias, Statistics of

In almost all areas of scientific inquiry researchers often want to infer, on the basis of sample data, the characteristics of a relationship that holds in the relevant population. Frequently this is the relationship between a dependent variable and one or more explanatory variables. In some cases, however, although one may have complete information on the explanatory variables, information on the dependent variable is lacking for some observations or ‘units.’ Furthermore, whether or not this information is present may not be conditionally independent (given the model) of the value taken by the dependent variable itself. This is a case of selection bias.

1. Selection Bias

The following illustrates the basic problem. Suppose one hypothesizes the following relationship, in the population, between a dependent variable, Y*, and a set of j = 1, …, J explanatory variables, X:

y*_i = x′_i β + ε_i   (1)

Here the unit-specific values of Y* are denoted y*_i (where i denotes the ith observation) and each unit’s values of the X variables are contained in the vector x_i. ε is a random error and the parameter vector, β, is to be estimated. Specific assumptions about the distribution of ε will determine the choice of estimation technique. If Y* is considered as a latent variable, a variety of different statistical models can be generated from Eqn. (1) depending on the manner in which Y* is actually observed (though still, at this point, assuming that Y* is observed, in some fashion, for all units). To do this, write a second equation that defines the relationship between the observed variable Y, and the latent variable, Y*. For example, where Y* is completely observed,

y_i = y*_i   (2a)

but one might also have

y_i = 1 if y*_i > 0; y_i = 0 otherwise   (2b)

In the case of Eqn. (2(b)) Y is the binary observed realization of the underlying latent variable, Y*. More complex relationships between Y* and Y are also possible. To capture the idea of selection bias, suppose that there is another latent variable Z* such that whether or not we observe a value for Y depends on the value of Z*, which is given by

z*_i = w′_i α + ν_i   (3)

Here w is a vector of explanatory variables with coefficients α, and ν is a random error term. The observed variable Z is defined as

z_i = 1 if z*_i > 0; z_i = 0 otherwise   (4)

Finally, define the observation equation for Y as depending on Z* as follows:

y_i = y*_i if z_i = 1; y_i unobserved if z_i = 0   (5)
Equations (1), (2), and (5), together with Eqns. (3) and (4), link the latent variable Y* to its observed counterpart, Y, when observation of the latter depends on the value of another variable, Z*. Selection bias occurs when there is a nonzero covariance between the error terms ε and ν. More complex selection bias models can be generated by having, for instance, more than one Z* variable. The censored regression (or Tobit) model is a simpler case in which whether Y is observed or not depends on whether it exceeds (or, in other cases, falls below) a given threshold value.
2. The Problem

Selection bias is a problem because if one tries to estimate β using normal statistical methods the resulting estimates will be poor and potentially misleading. For example, if Eqn. (2(a)) specifies the relationship between Y and Y*, discarding cases where Y is unobserved and running an OLS regression on the remainder will give estimates that are biased and inconsistent.

Where might this kind of selection bias arise? Suppose one draws a random sample from a population and carries out a survey. Some respondents may refuse to answer some, though not all, questions. If the response to such a question played the role of dependent variable in a regression analysis there would be a selection bias problem if the probability of having responded to the item was not independent of the response value, given the specification of the regression model. This may not always be so: in the simplest case nonresponse might be random. Alternatively, response might be independent of the response value controlling for a set of measured covariates. In this case there is selection on observables (Heckman and Robb 1985). But often neither of these cases holds and a nonzero residual correlation between the selection Eqn. (3) and the outcome Eqn. (1) means that there is selection on unobservables.

In studies of the criminal justice system in which the population is all those charged with a crime, sentence severity is observed only for those found guilty. In studies of the effectiveness of university education the outcome (say, examination results) is only observed for those who had completed that period of education. In studies of women’s earnings in paid work the earnings of women who are not in the labor force cannot be observed. To these examples could be added very many more. Selection bias is a pervasive problem. As a consequence a variety of methods has been proposed to deal with it.

The problem of selection bias arises because of the nonzero correlation of the errors in Eqns. (1) and (3), and this arises commonly in the following context. Suppose there is a nonexperimental comparison of two groups, one exposed to some policy treatment, the
other not. In this case the outcome of interest, Y, could be observed for members of both groups. Let Z now be the indicator of group membership (so that, for instance, Z = 0 means membership of the comparison group and Z = 1 means membership of the treatment group). In the case where Eqn. (2(a)) holds, write the equation for Y as

y_i = x′_i β + γ z_i + ε_i   (6)

and interest would focus on the estimate of γ as the gross effect on the outcome measure of being in the treatment, rather than the comparison, group. The problem of selection bias still arises to the extent that ε and ν have a nonzero correlation. The difference between this and the type of selection bias discussed initially (namely that now Y is observed for both groups) is more apparent than real, as an alternative formulation shows. For each unit Y is observed as the response to membership of either the treatment or comparison group, but not both. Let Y_1 be the outcome given treatment and Y_0 given no treatment. Then the ith individual unit’s gain from treatment is simply ∆_i = Y_1i − Y_0i. But ∆ cannot be measured since one cannot simultaneously observe a unit’s values of Y_1 and Y_0. Instead

Y_i = Z Y_1i + (1 − Z) Y_0i   (7)

is observed. Eqn. (7) thus specifies the incomplete observation of two latent variables, Y_1 and Y_0. This set-up can easily be extended to cases with more than two groups. As might be expected, this approach is commonly encountered in the evaluation literature but one of its earliest formulations was by Roy (1951) as the problem of self-selection.
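A small simulation makes the problem visible. The sketch below (invented parameter values; correlation 0.8 between the errors) generates data from Eqns. (1), (3), and (4) and shows that OLS on the observed subsample does not recover the true β:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                     # outcome-equation regressor
w = x + rng.normal(size=n)                 # selection covariate, correlated with x
# Errors with rho = 0.8: selection on unobservables.
eps, nu = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=n).T

y_star = 1.0 * x + eps                     # Eqn (1), true beta = 1
z = 0.5 * w + nu > 0                       # Eqns (3) and (4)

# Naive OLS slope using only the cases for which y is observed.
b_ols = np.polyfit(x[z], y_star[z], 1)[0]
print(round(b_ols, 3))                     # typically well below the true value of 1
```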
3. The Solutions

Broadly speaking there are two ways to address problems of selection bias: ex ante by research design, and ex post by statistical adjustment. The modern literature on correcting for selection bias is itself biased towards the latter, beginning with the work of Heckman in the 1970s (Heckman 1976, 1979), whose so-called ‘two-step’ method is widely used and hardwired into many econometrics programs. The argument that underlies this technique is the following. Assume the set-up as defined by Eqns. (1), (2(a)), (3), (4), and (5) and a nonzero covariance between ε and ν. Then we can write the ‘outcome equation’ (i.e., the regression equation for Y, given its observability) as

E(y_i | z_i = 1, x_i) = x′_i β + E(ε_i | z_i = 1)   (8)

Because there are only observations of Y when z = 1 there is an extra term for the conditional expectation of ε. Because of the nonzero covariance between ε and ν and if, as is usually assumed, E(ε) = 0, then this conditional expectation cannot be zero. One can write

E(y_i | z_i = 1, x_i) = x′_i β + E(ε_i | ν_i > −w′_i α)   (9)

Using a standard result for the value of a truncated bivariate normal distribution, the second term on the right-hand side of this equation is given by

E(ε_i | ν_i > −w′_i α) = ρ σ_ε σ_ν [φ(w′_i α) / Φ(w′_i α)]   (10)

Here the σs are the standard deviations of the respective error terms from Eqns. (1) and (3); ρ is the correlation between them; φ and Φ are, respectively, the density and distribution functions of the standard normal and their ratio, as it appears in Eqn. (10), is termed the ‘inverse Mill’s ratio.’ Heckman’s two-step method requires first running a probit regression with Z as the dependent variable, estimated using all the observations in the data. Then, for those observations for which Y is observed, the probit coefficient estimates are used to compute the value of the inverse Mill’s ratio. This is then inserted as an extra variable in the OLS regression in which Y is the dependent variable to give

E(y_i | z_i = 1, x_i) = x′_i β + θ λ̂_i   (11)

where λ is the inverse Mill’s ratio and the ‘hat’ indicates that it is estimated. Its coefficient, θ, is then itself an estimate of ρ σ_ε σ_ν. This is the covariance between ε and ν (σ_ν is set to unity in the probit). This approach is extended readily to the case in which an outcome is observed for both groups (Z = 0 and Z = 1).

What are the assumptions of this approach and what are the properties of the resulting estimator when we apply this method to data from a sample of the population? First, it is assumed that all the other requirements for an OLS regression are met. Second, in the set-up outlined above, the joint distribution of ε and ν should be bivariate normal (though, in general, weaker assumptions suffice, namely that ν be normally distributed and that the expectation of ε, conditional on ν, be linear in ν; see Olsen 1980). Given that the assumptions hold, the two-step estimator yields consistent estimates of the population β in Eqn. (1). The standard errors are incorrect (due to heteroscedasticity and the use of an estimated value of λ) but these are corrected readily. An alternative to the two-step approach is to estimate both the selection and outcome equations simultaneously using maximum likelihood (ML). The resulting estimates are asymptotically unbiased and more efficient than those from the two-step method (see Nelson 1984, who compares OLS, the two-step method, and ML).
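A minimal sketch of the two-step logic in Python, continuing the simulated data from the earlier sketch (statsmodels’ Probit and OLS are used here as generic estimation routines; as noted above, the reported standard errors would need correction in a real application):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

# Re-create the simulated data from the previous sketch.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
w = x + rng.normal(size=n)
eps, nu = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=n).T
y_star = 1.0 * x + eps
z = 0.5 * w + nu > 0

# Step 1: probit of the selection indicator on w, using all observations.
W = sm.add_constant(w)
probit = sm.Probit(z.astype(int), W).fit(disp=0)
index = W @ probit.params                  # fitted index w'alpha
lam = norm.pdf(index) / norm.cdf(index)    # inverse Mill's ratio, Eqn (10)

# Step 2: OLS on the selected cases, with the estimated ratio as a regressor.
X2 = sm.add_constant(np.column_stack([x[z], lam[z]]))
ols = sm.OLS(y_star[z], X2).fit()
print(ols.params)  # slope on x near 1; coefficient on lam estimates rho*sigma_eps
# Note: these standard errors ignore that lam is estimated and need correction.
```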
This two-step method is, in any case, applicable only when the relationship between Y* and Y is as given by Eqn. (2(a)). When this relationship is given by, for example, Eqn. (2(b)), the two-step method is inconsistent. In these cases ML is the most feasible option. If the outcome variable is itself binary, the joint log-likelihood of the selection and outcome equations has the same general form as for a bivariate probit, but one in which there are three, rather than four, possible outcomes. They are z = 1 and y = 1; z = 1 and y = 0; and z = 0 (in which case y is not observed).

Although the two-step method is probably the most widely used approach to correcting for selection bias it has been subjected to much criticism. Among the main objections are the following:

(a) Sensitivity to distributional assumptions. Practical implementation of the method renders it particularly sensitive in this respect. If the assumptions are not met the estimator has no desirable properties (i.e., it is not even consistent).

(b) Identification and robustness. It is common to find that the estimate of the inverse Mill’s ratio is highly correlated either with the explanatory variables or with the intercept in Eqn. (11). The extent to which such problems will arise depends mainly on three things: (i) the specification of the selection equation; (ii) the sample correlation between the two sets of explanatory variables, X and W (call this q); and (iii) the degree of sample selection in the sample (p, the proportion of cases for which z = 1). For example, if X and W are identical, the two-equation system is identified only because of the nonlinearity of the probit equation. But for some ranges of p the probit function is almost linear, with the result that the estimated λ will be close to a linear function of the explanatory variables in the probit; and thus it will be highly correlated with the X variables in the outcome equation. In general, the correlation between the inverse Mill’s ratio estimate and the other explanatory variables in this equation will be greater the greater is q and the closer is p to zero or one (this issue is discussed in detail in Breen 1996). If the selection equation does not discriminate well between the selected and unselected observations the estimated inverse Mill’s ratio will be approximately a constant, and there will therefore be a high correlation between it and the intercept of the outcome equation.

Both these objections reflect genuine difficulties in applying the model. In principle the solution to the identification and robustness problems is simple: ensure that the probit equation discriminates well and do not rely on the nonlinearity of the probit for identification. In practice it may be rather difficult to achieve these things. On the other hand, the issue of distributional assumptions may be even less tractable, obliging recourse to semiparametric or other approaches (Lee 1994; Cosslett 1991).
An alternative is to try to assess the likely degree of bias that sample selection might induce in a given case. Rubin (1977) presents a Bayesian approach in which the investigator computes the likely degree of bias, conditional on a prior belief about the relationship between the parameters of the distribution of Y in the selected and nonselected samples. The mean and the variance of the latter sample might, for example, be expressed as a function of their values in the selected sample, conditioning on the values of observed covariates. It is then straightforward to express the extent of selection bias for plausible values of the parameters of this function and to place a corresponding Bayesian probability interval around any estimates. Rosenbaum (1995, 1996) suggests and illustrates other sensitivity analyses for selection bias in nonexperimental evaluations.
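The spirit of such a sensitivity analysis can be conveyed with a deliberately simple calculation. In this sketch, not from the original text and with every number invented for illustration, the analyst posits that the mean of Y among nonselected cases differs from the selected-sample mean by some shift delta, and traces out the implied bias of the naive estimate as delta varies:

```python
import numpy as np

mean_selected = 10.0   # observed mean of Y among selected cases (z = 1)
p = 0.7                # observed proportion selected

# Sweep the assumed shift between the nonselected and selected means.
for delta in np.linspace(-2.0, 2.0, 5):
    pop_mean = p * mean_selected + (1 - p) * (mean_selected + delta)
    print(f"delta = {delta:+.1f}: population mean = {pop_mean:.2f}, "
          f"bias of naive estimate = {mean_selected - pop_mean:+.2f}")
```

A fuller Bayesian treatment would place a prior over delta (and over the corresponding variance relationship) and report a probability interval rather than a grid of point values, but the logic is the same.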
4. Program Evaluation and Selection Bias
For several years now the literature on selection bias has been dominated by discussion of how to evaluate programs (such as job training programs) when randomized assignment to treatment and control group is not possible. In a very widely cited paper, Lalonde (1986) compared the measured effectiveness of labor market programs using a randomized assignment design with their effectiveness computed using the Heckman method and showed that they led to quite different results (but see also Heckman and Hotz 1989). While some have seen this as a fatal criticism of the method, one consequence has been the development of greater awareness of the need to ensure the suitability of different selection bias correction approaches for specific cases. Another has been a greater concern to identify other possible sources of bias in nonrandomized program evaluations.

In the statistical literature, matching methods commonly are advocated in program evaluations in the absence of randomization. They involve the direct comparison of treated and untreated units that share common, or very similar, values on a set of relevant covariates, X. The rationale for this is the assumption that, conditional on X, and assuming only selection on observables (all of which are in X), the observed outcome for nonparticipants has the same distribution as the unobserved outcome that participants would have had had they not participated (see Heckman et al. 1997). This then allows one to estimate ∆. In passing, note that matching is also used commonly to impute missing values in item nonresponse in surveys (Little and Rubin 1987).

A central issue is how to carry out such matching. If there are K covariates, then the problem is one of matching pairs, or sets, of participants and nonparticipants in this K-dimensional space. But by a result due to Rosenbaum and Rubin (1983) (see also Rosenbaum 1995), matching using a scalar quantity called the propensity score is equally
effective. The propensity score is simply the probability of being in the treatment, rather than the comparison group, given the observed covariates. Matching on the propensity score controls bias due to all observed covariates and, even if there is selection on unobservables, the method nevertheless produces treatment and comparison groups with the same distribution of X variables. This is so under the assumption that the true propensity score is known, but Rosenbaum and Rubin (1984) show that an estimated propensity score (typically using a logit model) appears to perform at least as well. The use of matching draws attention to the problem of the possibly different distributions of covariates (or, equally, of propensity scores) in the treatment and comparison groups. There are two important aspects: first, the support of the propensity score may be different, so some ranges of propensity score values may be present in one group and not the other. More simply, some participants may have no comparable non-participants. Second, the distributions of the set of common values of propensity scores (i.e., which appear in both groups) may be different (Heckman et al. 1997, 1998). Hitherto, in practice, both of these sources of bias in evaluating the program’s effects commonly would have been confounded with selection bias proper. The use of propensity scores can remove the former but, if there is also selection on unobservables (i.e., selection bias proper), matching cannot be expected to solve the problem. Indeed, it is possible that these different forms of bias may have opposite signs, so that correcting for some but not others may make the overall bias greater. Nevertheless, the ability to draw such finer distinctions among biases is a valuable development, not least because it allows one to focus on methods for controlling selection bias, free from the contaminating effects of other biases.
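The matching logic lends itself to a compact illustration. Below is a minimal propensity-score matching sketch in Python, not part of the original text; the simulated data, the logit specification, and nearest-neighbor matching with replacement are all illustrative choices:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
X = rng.normal(size=(n, 3))                                  # observed covariates
treat = (X @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n) > 0).astype(int)
y = X.sum(axis=1) + 2.0 * treat + rng.normal(size=n)         # true effect = 2

# Estimate the propensity score: probability of treatment given X (logit).
logit = sm.Logit(treat, sm.add_constant(X)).fit(disp=0)
pscore = logit.predict(sm.add_constant(X))

# Nearest-neighbor matching with replacement: for each treated unit,
# find the comparison unit whose propensity score is closest.
treated = np.flatnonzero(treat == 1)
controls = np.flatnonzero(treat == 0)
gaps = np.abs(pscore[treated][:, None] - pscore[controls][None, :])
matches = controls[gaps.argmin(axis=1)]

# Effect of treatment on the treated: mean outcome gap across matched pairs.
print((y[treated] - y[matches]).mean())   # close to 2: selection is on X only
```

Because treatment here depends only on the observed X (selection on observables), the matched comparison recovers the treatment effect; as the text emphasizes, no amount of matching on X would remove bias arising from selection on unobservables.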
5. Conclusion

This article has only skimmed the surface of selection bias statistics. In particular, it has looked only at models for cross-sectional data: other approaches are possible given other sorts of data. For example, given longitudinal data, with observations of Y for each unit at two (or more) points in time, one prior to the program to be evaluated, the other after it, then it may be possible to specify and test a model which assumes that unobserved causes of selection bias are unit-specific and time-invariant and can therefore be removed by differencing (Heckman and Robb 1985, Heckman et al. 1997, 1998). Considerations of this sort draw attention to the ex ante approach to dealing with selection bias that was referred to earlier. Here the emphasis falls on designing research so as to remove, or at least reduce, selection bias. Random assignment is one possibility, though not always feasible in practice. Nevertheless, quasi-experimental approaches, interrupted time-series, and longitudinal
designs with repeated measures on the outcome variable are among a range of possibilities that could be considered in designing one's research to minimize selection bias problems (see Rosenbaum 1999 and associated comments). Even so, there will still be many instances in which analysts use data over whose collection they have had no control and where the use of ex post adjustments will be unavoidable.

To conclude: much work on the selection bias problem has been undertaken since the 1980s, yet there is no widespread agreement on which statistical methods are most suitable for use in correcting for the problem. Some broad guidelines for practitioners can, however, be discerned: (a) the need to ensure that methods to correct for selection bias are appropriate and that their requirements (such as distributional properties) are met; (b) the need to be aware of other, possibly confounding, sources of bias; (c) the usefulness of analyses of the sensitivity of conclusions to various possible magnitudes of selection bias, and of the sensitivity of the selection-bias corrected results to the assumptions of whatever method is employed; and (d) the desirability of designing research so that selection bias problems are, as far as possible, eliminated without the need for complex and sometimes fragile ex post adjustment.

See also: Longitudinal Research: Panel Retention; Mortality Differentials: Selection and Causation; Nonequivalent Group Designs; Screening and Selection
Bibliography

Breen R 1996 Regression Models: Censored, Sample-selected or Truncated Data. Sage, Thousand Oaks, CA
Cosslett S 1991 Semiparametric estimation of a regression model with sample selectivity. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK
Heckman J J 1976 The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5: 475–92
Heckman J J 1979 Sample selection bias as a specification error. Econometrica 47: 153–61
Heckman J J, Hotz V J 1989 Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. Journal of the American Statistical Association 84(408): 862–74
Heckman J J, Ichimura H, Smith J, Todd P E 1998 Characterizing selection bias using experimental data. Econometrica 66: 1017–98
Heckman J J, Ichimura H, Todd P E 1997 Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme. Review of Economic Studies 64: 605–54
Heckman J J, Robb R 1985 Alternative methods for evaluating the impact of interventions. In: Heckman J J, Singer B (eds.) Longitudinal Analysis of Labor Market Data. Cambridge University Press, New York
Lalonde R J 1986 Evaluating the econometric evaluations of training programs with experimental data. American Economic Review 76: 604–20
Lee L F 1994 Semi-parametric two-stage estimation of sample selection models subject to Tobit-type selection rules. Journal of Econometrics 61: 305–44
Little R J A, Rubin D B 1987 Statistical Analysis with Missing Data. Wiley, New York
Nelson F D 1984 Efficiency of the two-step estimator for models with endogenous sample selection. Journal of Econometrics 24: 181–96
Olsen R J 1980 A least squares correction for selectivity bias. Econometrica 48: 1815–20
Rosenbaum P R 1995 Observational Studies. Springer-Verlag, New York
Rosenbaum P R 1996 Observational studies and nonrandomized experiments. In: Ghosh S, Rao C R (eds.) Handbook of Statistics. Elsevier, Amsterdam, Vol. 13
Rosenbaum P R 1999 Choice as an alternative to control in observational studies. Statistical Science 14: 259–304
Rosenbaum P R, Rubin D B 1983 The central role of the propensity score in observational studies for causal effects. Biometrika 70: 41–55
Rosenbaum P R, Rubin D B 1984 Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association 79: 516–24
Roy A D 1951 Some thoughts on the distribution of earnings. Oxford Economic Papers 3: 135–46
Rubin D B 1977 Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association 72: 538–43

R. Breen
Self-concepts: Educational Aspects

The capacity to reflect on one's own capabilities and actions is uniquely human. Early in life, young children begin to form beliefs about themselves which may serve as reference mechanisms for perceiving the world and oneself, and for regulating emotion, motivation, and action. Research on self-related beliefs can be traced back to the seminal writings of William James (1893), who distinguished the self as 'I' ('existential self') and as 'me' ('categorical self'), the latter implying cognitions about the self as an object of thinking (as represented in a sentence like 'I think about me'). Research continued throughout the following decades and gained a central status in personality and social psychology as well as educational research after the cognitive paradigm shift in
the 1950s and 1960s. At that time, behavioristic approaches were gradually replaced by cognitive perspectives, thus making it possible to acknowledge the importance of cognitive processes in human agency. Since then, the number of studies on self-concepts has increased drastically. This also applies to studies on the educational relevance of self-concepts (Hansford and Hattie 1982, Helmke 1992).
1. Definition of the Term 'Self-concept'

The term 'self-concept' is used in two interrelated ways. Talking about the self-concept of a person has to be differentiated from referring to a number of different self-concepts of this person. Usage of the term in self-concept research implies that 'the' self-concept may be defined as the total set of a person's cognitive representations of him- or herself which are stored in memory. In the plural sense, different self-concepts of a person are subsets of such representations relating to different self-attributes (like abilities, physical appearance, social status, etc.). In both variants, the term implies that self-concepts are self-representations which are more or less enduring over time, in contrast to situational self-perceptions. The definition is open as to whether these representations refer to the factual reality of personal attributes, or to fictitious (e.g., possible or ideal) attributes.

There are two definitional problems which remain conceptually unresolved and impede scientific progress. First, it is unclear whether self-related emotions should be included or not. One may argue that emotions are different from cognition and should be regarded as separate constructs. From such a perspective, (cognitive) self-concepts and (emotional) self-related feelings might be subsumed under umbrella constructs like self-esteem, comprising both cognitive and affective facets, but should not be mixed up conceptually. On the other hand, prominent measures of the construct take more integrative perspectives by combining cognitive and affective self-evaluative items. One example is H. Marsh's widely used Self-Description Questionnaires (SDQ: Marsh 1988), measuring academic self-concepts both by cognitive items (e.g., 'I am good at mathematics') and by affective items (e.g., 'I enjoy doing work in mathematics').

Second, there is no common agreement on the conceptual status of self-related cognitions linking the self to own actions and the environment. An example is self-efficacy beliefs pertaining to own capabilities to be successful in solving specific types of tasks, thus cognitively linking the self (own capabilities) to own actions (task performance) and to environmental demands (a domain of tasks; cf. Pajares 1996; see Self-efficacy: Educational Aspects). It may be argued that such self-representations should be regarded as part of a person's self-concept as well. However, terms
like self-concept, on the one hand, and self-efficacy beliefs, control beliefs, etc., on the other, have been used as if they referred to conceptually distinct phenomena, and have been addressed by different traditions of research. These research traditions tend to ignore each other in spite of overlapping constructs and parallel findings, implying that more 'cross-talk' among researchers should take place in order to reduce conceptual confusion and the proliferation of redundant constructs which still prevails at the beginning of the twenty-first century (Pajares 1996).
2. Facets and Structures of Self-concepts

Self-related cognitions can refer to different attributes of the self, and can imply descriptive or evaluative perspectives pertaining to the factual or nonfactual (e.g., ideal) reality of these attributes. Research has begun to analyze the structures of these different representational facets.
2.1 Representations of Attributes: Hierarchies of Self-concepts

Self-concepts pertaining to different attributes may vary along two dimensions: the domain of attributes which is addressed, and their generality. It may be theorized that self-concepts are hierarchically organized along these two dimensions in much the same way as semantic memory networks are assumed to be, implying that more general self-concepts are located at top levels of the self-concept hierarchy, and more specific self-concepts at lower levels. In educational research, the hierarchical model put forward by Shavelson et al. (1976) stimulated studies on facets of self-concepts differing in generality. This model implied that a 'general self-concept' is located at the top of the hierarchy; an 'academic self-concept' pertaining to one's academic capabilities as well as social, emotional, and physical nonacademic self-concepts at the second level; self-concepts relating to different school subjects and to social, emotional, and physical subdomains at the third level; and more specific self-concepts implying evaluations of situation-specific behavior at lower levels.

Studies showed that correlations between academic self-concepts pertaining to different subjects tend to be near zero, in contrast to performance, which typically is correlated positively across academic domains. An explanation has been provided by Marsh's internal/external frame of reference model (I/E model), positing that the formation of academic self-concepts may be based on internal standards of comparison, implying within-subject comparison of abilities across domains, as well as external standards, implying between-subjects social comparison with other students (Marsh 1986). Applying internal standards may lead to negative correlations between self-concepts pertaining to different domains (e.g., perceiving high ability in math may lead to lowered estimates of verbal abilities, and vice versa). External standards would imply positive correlations since achievement in domains like mathematics and languages is positively correlated across students. The model assumes that students use both types of standards, implying that opposing effects may cancel out. Accordingly, the Shavelson et al. (1976) model has been reformulated by assuming that students hold math-related and verbal-related academic self-concepts, but no general academic self-concept (Marsh and Shavelson 1985).
2.2 Different Perspectives on Attributes

Self-representations can imply descriptive as well as evaluative accounts of attributes (e.g., 'I am tall' vs. 'I am an attractive woman'). In academic self-concepts, this distinction may often be blurred because descriptions of academic competence may use comparison information implying some evaluation as well (e.g., 'I am better at math than most of my classmates'). Any subjective evaluation of academic competence may use a number of different standards of evaluation, e.g. (a) social comparison standards and (b) intraindividual comparison across domains as addressed by Marsh's I/E model (see above), as well as (c) intraindividual comparison of current and past competence (implying an evaluation of one's academic development), (d) mastery-oriented comparison relating one's competence to content-based criteria of minimal or optimal performance in an academic domain, and (e) cooperative standards linking individual performance to group performance.

Beyond existing attributes ('real self'), self-representations can pertain to attributes of the self which do not factually exist, but might exist ('possible selves'), are wanted by oneself ('ideal self'), wanted by others ('ought self'), etc. Concepts of nonreal selves may be as important for affect and motivated self-development as concepts of the real self. For example, perceived discrepancies between ideal and real self may be assumed to trigger depressive emotion, whereas discrepancies between ought and real self may give rise to anxiety (Higgins 1987).
3. Self-concepts and Academic Achievement

The relation between self-concept and academic achievement is one of the most often analyzed problems in both self-concept and educational research. In the first stage of research on this problem, connections between the two constructs were analyzed by correlational
methods based on cross-sectional field studies. Results implied that self-concepts and achievement may be positively linked. However, the magnitude of the correlation proved to depend on the self-concept and achievement constructs used. For example, in the meta-analysis provided by Hansford and Hattie (1982), the average correlation between self-concept and achievement/performance measures was r = 0.22 for general self-esteem, and r = 0.42 for self-concept of ability. When both self-concepts and achievement are measured in more domain-specific ways, correlations tend to be even higher (e.g., correlations for self-concepts and academic achievement in specific school subjects). This pattern of relations implies that the link gets closer when self-concept and criterion measure are matched, and when both are conceptualized in domain-specific ways.

In the second stage, researchers began to analyze the causal mechanisms producing these relations. From a theoretical perspective, the 'skill development' model maintained that academic self-concepts are the result of prior academic achievement, whereas the 'self-enhancement' approach implied that self-concepts influence students' achievement (cf. Helmke and van Aken 1995). In a number of longitudinal studies, both hypotheses were tested competitively by using cross-lagged panel analysis and structural equation modeling. Results showed that both hypotheses may be valid, thus implying that self-concepts and achievement may be linked by reciprocal causation: Prior academic achievement exerted effects over time on subsequent academic self-concepts, and self-concepts influenced later achievement, although effect sizes for the latter causal path were less strong and consistent (cf. Helmke and van Aken 1995, Marsh and Yeung 1997). This evidence suggests that academic self-concepts are partly based on information about one's own academic achievement, and may contribute to later academic learning and achievement.

Not much is known about the exact nature of these mechanisms. Investigators have just begun to explore possible mediators and moderators of self-concept/achievement relations. Judging from theory and available evidence, the following may be assumed.
3.1 Effects of Achievement on Self-concepts
Academic feedback of achievement (e.g., by grades) may underlie the formation of self-representations of academic capabilities. The match between feedback and self-representations may depend on the cumulativeness and consistency of feedback, its salience, the reference norms used, and the degree of congruency with competing information from parents, peers, or one's past. This implies that the impact of achievement on self-concepts may differ between schools, teachers, and classrooms using different standards and practices
of feedback (cf. Helmke 1992). Finally, beyond exerting effects on academic self-concepts, achievement feedback can be assumed to influence students' general sense of self-worth (Pekrun 1990), thus affecting students' overall psychological health and personality development as well.
3.2 Effects of Self-concepts on Achievement
Self-concepts may influence the formation of self-efficacy and success expectations when one is confronted with academic tasks. These expectations may underlie academic emotions and motivation, which may in turn influence effort, strategies of learning, and on-task behavior directly affecting learning and performance (Pekrun 1993). Research has shown that self-concepts implying moderate overestimation of one's own capabilities may be optimal for motivation and learning gains, whereas an underestimation may be rather detrimental. However, self-evaluations precisely reflecting reality may also be suboptimal for learning and achievement, indicating that there may be a conflict between the educational goals of teaching students to be self-realistic vs. highly motivated (Helmke 1992).
4. Development of Self-concepts

4.1 Basic Needs Underlying Self-concept Development

Two classes of basic human needs seem to govern the development of self-representations (cf. Epstein 1973). One comprises general needs for maximizing pleasure and minimizing pain, from which needs for self-enhancement follow, implying motivation to establish and maintain a positive view of the self. The second category comprises needs to perceive reality and foresee the future in such ways that adaptation and survival are possible, thus implying needs for self-perceptions which are consistent with reality, and consistent with each other (needs for consistency). Self-enhancement and consistency may converge when feedback about the self is positive. However, they may be in conflict when feedback is negative: The need for self-enhancement would suggest not accepting negative feedback, whereas needs for reality-oriented consistency would imply endorsing it.
4.2 Developmental Courses across the Life Span

The development of self-concepts is characterized by the interplay of mechanisms driven by these conflicting basic needs. In general, in the absence of strong information about reality, humans tend to overestimate their capabilities in self-enhancing ways. At the same time, they search for self-related information, and tend to endorse this information even if it is negative, on condition that such negative information is salient, consistent, and cumulative. For example, it has repeatedly been found that many children entering school drastically overestimate their competences to master academic demands, but adjust their self-evaluations downward when diverging cumulative feedback is given (Helmke 1992). The process of realistically interpreting self-related feedback may be impeded in young children by their lack of sufficiently sophisticated metacognitive competences. Nevertheless, the interplay of beginners' optimism and experts' (relative) realism may characterize not only the early elementary years, but later phases of entering new institutions as well (university, marriage, a new job, etc.), thus implying a dynamic interplay of self-related optimism and realism across much of the life span.
4.3 Impact of Educational Environments
The influence of social environments was addressed early in the twentieth century by symbolic interactionism, postulating that interactions with significant others may shape the development of self-concepts (cf. the concept of the 'looking glass self,' Cooley 1902). Generally, a number of environmental variables may be influential. Self-concepts can develop according to direct attributions of traits and personal worth by other persons, on condition that such attributions are consistent with other sources of information and interpreted accordingly. A second source of self-related information is indirect, implicit attributions which are conveyed by others' emotional and instrumental behavior towards the developing person. Of specific importance are acceptance and support by others implying attributions of personal worth, thereby influencing the development of a person's general self-esteem (Pekrun 1990).

Beyond attributions, social environments may define situational conditions for the development of knowledge, skills, motivation, and behavior, which may in turn contribute to self-perceptions of one's own competences. For example, instruction may build up knowledge and skills influencing knowledge-specific academic self-concepts; support of autonomy may foster the acquisition of self-regulatory abilities and, thereby, the development of related self-concepts of abilities; and consistent behavioral rules may enhance the cognitive predictability of students' environments, which may also positively affect their overall sense of competence.

As outlined above, concerning academic self-concepts, feedback of achievement implying information about abilities and effort may be of specific importance, and the effects of such feedback may depend on
the reference norms used. Feedback given according to competitive, interindividually referenced norms implies that the self-concepts of high achievers may benefit, whereas low achievers may have difficulties protecting their self-esteem. In contrast, intraindividual, mastery-oriented, and cooperative norms may be better suited to foster low achievers' self-concepts as well. Finally, classroom conditions may also be influential. Specifically, grouping of students may influence students' relative, within-classroom achievement positions, thus affecting their academic self-concepts when competitive standards of grading are used. For example, being in a low-ability class would help an average student to maintain positive academic self-concepts, whereas being in a class of highly gifted students would enforce downward adjustment of self-evaluation ('big-fish-little-pond effect,' Marsh 1987).
5. Summary of Implications for Educational Practice

Fostering students' self-concepts may be beneficial for their achievement and personality development. Furthermore, since positive self-esteem may be regarded as a key element of psychological health and well-being, nurturing self-esteem may be regarded as an educational goal which is valuable in itself. From the available research summarized above, it follows that education may contribute substantially to self-concept development. Concerning parents as well as academic institutions like the school, acceptance and support may be of primary importance for the development of general self-esteem. In addition, giving students specific feedback on achievement, traits, and abilities may help in shaping optimistic self-concepts and expectancies which nevertheless are grounded in reality. In designing feedback, it may be helpful to use mastery-oriented, individual, and cooperative reference norms instead of relying on social comparison standards and competitive grading. Furthermore, educational environments may help students' self-concept development by providing high-quality instruction, consistent normative and behavioral structures implying predictability, as well as sufficient autonomy support fostering students' sense of controllability and competence.

See also: Motivation, Learning, and Instruction; School Achievement: Cognitive and Motivational Determinants; Schooling: Impact on Cognitive and Motivational Development; Self-development in Childhood; Self-efficacy; Self-efficacy: Educational Aspects; Self-regulated Learning
Bibliography

Bracken B A (ed.) 1996 Handbook of Self-concept. Wiley, New York
Cooley C H 1902 Human Nature and the Social Order. Scribner, New York
Covington M V 1992 Making the Grade. Cambridge University Press, New York
Covington M V, Beery R 1976 Self-worth and School Learning. Holt, Rinehart and Winston, New York
Epstein S 1973 The self-concept revisited or a theory of a theory. American Psychologist 28: 404–16
Hansford B C, Hattie J A 1982 The relationship between self and achievement/performance measures. Review of Educational Research 52: 123–42
Hattie J 1992 The Self-concept. Erlbaum, Hillsdale, NJ
Helmke A 1992 Selbstvertrauen und schulische Leistungen. Hogrefe, Göttingen, Germany
Helmke A, van Aken M A G 1995 The causal ordering of academic achievement and self-concept of ability during elementary school: A longitudinal study. Journal of Educational Psychology 87: 624–37
Higgins E T 1987 Self-discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
James W 1893 Psychology. Fawcett, New York
Kulik J A, Kulik C L 1992 Meta-analytic findings on grouping processes. Gifted Child Quarterly 36: 73–7
Marsh H W 1986 Verbal and math self-concepts: An internal/external frame of reference model. American Educational Research Journal 23: 129–49
Marsh H W 1987 The big-fish-little-pond effect on academic self-concept. Journal of Educational Psychology 79: 280–95
Marsh H W 1988 Self Description Questionnaire I (SDQ): Manual and Research Monograph. Psychological Corporation, San Antonio, TX
Marsh H W, Shavelson R J 1985 Self-concept: Its multifaceted, hierarchical structure. Educational Psychologist 20: 107–25
Marsh H W, Yeung A S 1997 Causal effects of academic self-concept on academic achievement: Structural equation models of longitudinal data. Journal of Educational Psychology 89: 41–54
Pajares F 1996 Self-efficacy beliefs in academic settings. Review of Educational Research 66: 543–78
Pekrun R 1990 Social support, achievement evaluations, and self-concepts in adolescence. In: Oppenheimer L (ed.) The Self-concept. Springer, Berlin, pp. 107–19
Pekrun R 1993 Facets of adolescents' academic motivation: A longitudinal expectancy-value approach. In: Maehr M, Pintrich P (eds.) Advances in Motivation and Achievement. JAI Press, Greenwich, CT, Vol. 8, pp. 139–89
Shavelson R J, Hubner J J, Stanton G C 1976 Self-concept: Validation of construct interpretations. Review of Educational Research 46: 407–41
Skaalvik E M, Rankin R J 1995 A test of the internal/external frame of reference model at different levels of math and verbal self-perception. American Educational Research Journal 32: 161–84
R. Pekrun
Self-conscious Emotions, Psychology of

Shame, guilt, embarrassment, and pride are members of a family of 'self-conscious emotions' that are evoked by self-reflection and self-evaluation. This self-evaluation may be implicit or explicit, consciously experienced or transpiring beyond our awareness. But either way, the self is the object of self-conscious emotions.

In contrast to 'basic' emotions (e.g., anger, fear, joy) which are present very early in life, self-conscious emotions have been described as 'secondary,' 'derived,' or 'complex' emotions because they emerge later and hinge on several cognitive achievements—recognition of the self as separate from others and a set of standards against which the self is evaluated. For example, Lewis et al. (1989) showed that the capacity to experience embarrassment coincides with the emergence of self-recognition. Very young children first show behavioral signs of embarrassment during the same developmental phase (15–24 months) in which they show a rudimentary sense of self. Moreover, within this range, children who display self-recognition (in a 'rouge' test) are the same children who display signs of embarrassment in an unrelated task.
1. Shame and Guilt

The terms 'shame' and 'guilt' are often used interchangeably. When people do distinguish the two, they typically suggest that shame arises from public exposure and disapproval of a failure or transgression, whereas guilt is more 'private,' arising from one's own conscience. Recent empirical research has failed to support this public/private distinction. For example, Tangney et al. (1996) found that people's real-life shame and guilt experiences are each most likely to occur in the presence of others. More important, neither the presence of others nor others' awareness of the respondents' behavior distinguished between shame and guilt. Overall, there are surprisingly few differences in the types of events that elicit shame and guilt. Shame is somewhat more likely in response to violations of social norms, but most types of events (e.g., lying, cheating, stealing, failing to help another) result in guilt for some people and shame for others.

So what is the difference between shame and guilt? Empirical research supports Helen Block Lewis's (1971) notion that shame involves a negative evaluation of the global self, whereas guilt involves a negative evaluation of a specific behavior. This differential emphasis on self ('I did that horrible thing') vs. behavior ('I did that horrible thing') leads to different affective experiences. Shame is an acutely painful emotion typically accompanied by a sense of shrinking, 'being small,' and by a sense of worthlessness and powerlessness. Shamed people also feel exposed. Although shame doesn't necessarily involve an actual observing audience, there is often the imagery of how one's defective self would appear to others. Not surprisingly, shame often leads to a desire to escape or hide—to sink into the floor and disappear.
In contrast, guilt is typically less painful and devastating because the primary concern is with a specific behavior, not the entire self. Guilt doesn’t affect one’s core identity. Instead, there is tension, remorse, and regret over the ‘bad thing done’ and a nagging preoccupation with the transgression. Rather than motivating avoidance, guilt typically motivates reparative action.
1.1 Implications of Shame and Guilt for Interpersonal Adjustment

Research has consistently shown that, on balance, guilt is the more adaptive emotion, benefiting relationships in a variety of ways (Baumeister et al. 1994, Tangney 1995). Three sets of findings illustrate the adaptive, 'relationship-enhancing' functions of guilt, in contrast to the hidden costs of shame.

First, shame typically leads to attempts to deny, hide, or escape; guilt typically leads to reparative action—confessing, apologizing, undoing. Thus, guilt orients people in a more constructive, proactive, future-oriented direction, whereas shame orients people toward separation, distance, and defense.

Second, a special link between guilt and empathy has been observed at the levels of both emotion states and dispositions. Studies of children, college students, and adults (Tangney 1995) show that guilt-prone individuals are generally empathic individuals. In contrast, shame-proneness is associated with an impaired capacity for other-oriented empathy and a propensity for 'self-oriented' personal distress responses. Similar findings are evident when considering feelings of shame and guilt 'in the moment.' Individual differences aside, when people describe personal guilt experiences, they convey greater empathy for others, compared to descriptions of shame experiences. It appears that by focusing on a bad behavior (as opposed to a bad self), people experiencing guilt are relatively free of the egocentric, self-involved focus of shame. Instead, their focus on a specific behavior is likely to highlight the consequences for distressed others.

Third, there is a special link between shame and anger, again observed at both the dispositional and state levels. At all ages, shame-prone individuals are also prone to feelings of anger and hostility (Tangney 1995). Moreover, once angered, they tend to manage their anger in an aggressive, unconstructive fashion. In contrast, guilt is generally associated with constructive means of handling anger. Similar findings have been observed at the situational level, too. In a study of couples' real-life episodes of anger, shamed partners were significantly more angry, more aggressive, and less likely to elicit conciliatory behavior from the offending partner (Tangney 1995). What accounts for this link between shame and anger? When feeling
shame, people initially direct hostility inward ('I'm such a bad person'). But this hostility may be redirected outward in a defensive attempt to protect the self by shifting the blame elsewhere ('Oh what a horrible person I am, and damn it, how could you make me feel that way!').

In sum, studies employing diverse samples, measures, and methods converge. All things equal, it's better if your friend, partner, child, or boss feels guilt than shame. Shame motivates behaviors that interfere with interpersonal relationships. Guilt helps keep people constructively engaged in the relationship at hand.

1.2 Implications of Shame and Guilt for Psychological Adjustment

Although guilt appears to be the more 'moral' or adaptive emotion when considering social adjustment, is there a trade-off vis-à-vis individual psychological adjustment? Does the tendency to experience guilt or shame leave one vulnerable to psychological problems? Researchers consistently report a relationship between proneness to shame and a whole host of psychological symptoms, including depression, anxiety, eating disorder symptoms, subclinical sociopathy, and low self-esteem (Harder et al. 1992, Tangney et al. 1995). This relationship is robust across measurement methods and diverse populations.

There is more controversy regarding the relationship of guilt to psychopathology. The traditional view is that guilt plays a significant role in psychological symptoms. Clinical theory and case studies make frequent reference to a maladaptive guilt characterized by chronic self-blame and obsessive rumination over one's transgressions. On the other hand, recent theory and research has emphasized the adaptive functions of guilt, particularly for interpersonal behavior. Tangney et al. (1995) argued that once one makes the critical distinction between shame and guilt, there's no compelling reason to expect guilt over specific behaviors to be associated with poor psychological adjustment. The empirical research is similarly mixed. Studies employing adjective checklist-type (and other globally worded) measures find both shame-proneness and guilt-proneness correlated with psychological symptoms. On the other hand, measures sensitive to the self vs. behavior distinction show no relationship between proneness to 'shame-free' guilt and psychopathology.

1.3 Development of Guilt Distinct from Shame

The experience of self-conscious emotions requires the development of standards and a recognized self. In addition, a third ability is required to experience guilt (about specific behaviors) independent of shame (about the self)—the ability to make a clear distinction between self and behavior. Developmental research indicates that children do not begin making meaningful distinctions between attributions to ability (enduring characteristics) vs. attributions to effort (more unstable, volitional factors) until about age eight—the same age at which researchers find interpretable differences in children's descriptions of shame and guilt experiences (Ferguson et al. 1991).
2. Embarrassment

Miller (1996) defines embarrassment as 'an aversive state of mortification, abashment, and chagrin that follows public social predicaments' (p. 322). Indeed, embarrassment appears to be the most 'social' of the self-conscious emotions, occurring almost without exception in the company of others.

2.1 Causes of Embarrassment
In Miller's (1996) catalog of embarrassing events described by several hundred adolescents and adults, 'normative public deficiencies' (tripping in front of a large class, forgetting someone's name, unintended bodily induced noises) were at the top of the list. But there were many other types of embarrassment situations—awkward social interactions, conspicuousness in the absence of any deficiency, 'team transgressions' (being embarrassed by a member of one's group), and 'empathic' embarrassment. The diversity of situations that lead to embarrassment has posed a challenge to efforts at constructing a comprehensive 'account' of embarrassment. Some theorists believe that the crux of embarrassment is negative evaluation by others (Edelmann 1981, Miller 1996). This social evaluation account runs into difficulty with embarrassment events that involve no apparent deficiency (e.g., being the center of attention during a 'Happy Birthday' chorus). Other theorists subscribe to the 'dramaturgic' account, positing that embarrassment occurs when implicit social roles and scripts are disrupted. A flubbed performance, an unanticipated belch, and being the focus of 'Happy Birthday' each represent a deviation from accustomed social scripts.

Lewis (1992) distinguished between two types of embarrassment—embarrassment due to exposure and embarrassment due to negative self-evaluation. According to Lewis (1992), embarrassment due to exposure emerges early in life, once children develop a rudimentary sense of self. When children develop standards, rules, and goals (SRGs), a second type of embarrassment emerges—'embarrassment as mild shame' associated with failure in relation to SRGs.

2.2 Functions of Embarrassment

Although there is debate about the fundamental causes of embarrassment, there is general agreement about its
adaptive significance. Gilbert (1997) suggests that embarrassment serves an important social function by signaling appeasement to others. When untoward behavior threatens a person's standing in an important social group, visible signs of embarrassment function as a nonverbal acknowledgment of shared social standards, thus diffusing negative evaluations and the likelihood of retaliation. Evidence from studies of both humans and nonhuman primates supports this remedial function of embarrassment (Keltner and Buswell 1997).
2.3 Embarrassment and Shame

Is there a difference between shame and embarrassment? Some theorists essentially equate the two emotions. A more dominant view is that shame and embarrassment differ in intensity of affect and/or severity of transgression. Still others propose that shame is tied to perceived deficiencies of one's core self, whereas embarrassment results from deficiencies in one's presented self. Recent research suggests that shame and embarrassment are indeed quite different emotions—more distinct, even, than shame and guilt. For example, comparing adults' personal shame, guilt, and embarrassment experiences, Tangney et al. (1996) found that shame was a more intense, painful emotion that involved a greater sense of moral transgression. But controlling for intensity and morality, shame and embarrassment still differed markedly along many affective, cognitive, and motivational dimensions. When shamed, people felt greater responsibility, regret, and self-directed anger. Embarrassment was marked by more humor, blushing, and a greater sense of exposure.
2.4 Individual Differences in Embarrassability

As with shame and guilt, people vary in their propensity to experience embarrassment. These individual differences are evident within the first years of life, and are relatively stable across time. Research has shown that embarrassability is associated with neuroticism, high levels of negative affect, self-consciousness, and a fear of negative evaluation from others. Miller (1996) has shown that this fear of negative evaluation is not due to poor social skills, but rather to a heightened concern for social rules and standards.
3. Pride

Of the self-conscious emotions, pride has received the least attention. Most research comes from developmental psychology, particularly in the achievement domain.
3.1 Developmental Issues

There appear to be substantial developmental shifts in the types of situations that induce pride, the nature of the pride experience itself, and the ways in which pride is expressed. For example, Stipek et al. (1992) observed developmental changes in the criteria children use for evaluating success and failure—and in the types of situation that lead to pride. Children under 33 months respond positively to task-intrinsic criteria (e.g., completing a tower of blocks), but they do not seem to grasp the concept of competition (e.g., winning or losing a race to complete a tower of blocks). It is only after 33 months that children show enhanced pride in response to a competitive win.

There are also developmental shifts in the importance of praise from others. Stipek et al. (1992) reported that all children 13–39 months smiled and exhibited pleasure with their successes. But there were age differences in social referencing. As children neared two years of age, they began to seek eye contact with parents upon completing a task, often actively soliciting parental recognition, which in turn enhanced children's pleasure with achievements. Stipek et al. (1992) suggest that the importance of external praise may be curvilinear across the lifespan. Very young children take pleasure in simply having an immediate effect on their environment; as they develop self-consciousness (at about two), others' reactions shape their emotional response to success and failure. Still later, as standards become increasingly internalized, pride becomes again more autonomous, less contingent on others' praise and approval.

3.2 Two Types of Pride?

Both Tangney (1990) and Lewis (1992) have suggested that there are two types of pride. Paralleling the self vs. behavior distinction of guilt and shame, Tangney (1990) distinguished between pride in self ('alpha' pride) and pride in behavior ('beta' pride). Similarly, Lewis (1992) distinguished between pride (arising from attributing one's success to a specific action) and hubris (pridefulness arising from attributions of success to the global self). Lewis (1992) views hubris as largely maladaptive, noting that hubristic individuals are inclined to distort and invent situations to enhance the self, which can lead to interpersonal problems.
4. Future Research

Future research will no doubt focus on biological and social factors that shape individual differences in self-conscious emotions. In addition, we need to know more about the conditions under which guilt, shame, pride, and embarrassment are most likely to be adaptive vs. maladaptive. Finally, more cross-cultural research is needed. Kitayama et al. (1995) make the
compelling argument that, owing to cultural differences in the construction of the self, self-conscious emotions may be especially culturally sensitive.

See also: Culture and Emotion; Emotion and Expression; Emotion, Neural Basis of; Emotions, Evolution of; Emotions, Psychological Structure of; Shame and the Social Bond
Bibliography

Baumeister R F, Stillwell A M, Heatherton T F 1994 Guilt: An interpersonal approach. Psychological Bulletin 115: 243–67
Edelmann R J 1981 Embarrassment: The state of research. Current Psychological Reviews 1: 125–38
Ferguson T J, Stegge H, Damhuis I 1991 Children's understanding of guilt and shame. Child Development 62: 827–39
Gilbert P 1997 The evolution of social attractiveness and its role in shame, humiliation, guilt, and therapy. British Journal of Medical Psychology 70: 113–47
Harder D W, Cutler L, Rockart L 1992 Assessment of shame and guilt and their relationship to psychopathology. Journal of Personality Assessment 59: 584–604
Keltner D, Buswell B N 1997 Embarrassment: Its distinct form and appeasement functions. Psychological Bulletin 122: 250–70
Kitayama S, Markus H R, Matsumoto H 1995 Culture, self, and emotion: A cultural perspective on 'self-conscious' emotion. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 439–64
Lewis H B 1971 Shame and Guilt in Neurosis. International Universities Press, New York
Lewis M 1992 Shame: The Exposed Self. Free Press, New York
Lewis M, Sullivan M W, Stanger C, Weiss M 1989 Self-development and self-conscious emotions. Child Development 60: 146–56
Mascolo M F, Fischer K W 1995 Developmental transformation in appraisals for pride, shame, and guilt. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford, New York, pp. 64–113
Miller R S 1996 Embarrassment: Poise and Peril in Everyday Life. Guilford Press, New York
Stipek D J, Recchia S, McClintic S 1992 Self-evaluation in young children. Monographs of the Society for Research in Child Development 57 (1, Serial No. 226) R5–R83
Tangney J P 1990 Assessing individual differences in proneness to shame and guilt: Development of the self-conscious affect and attribution inventory. Journal of Personality and Social Psychology 59: 102–11
Tangney J P 1995 Shame and guilt in interpersonal relationships. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 114–39
Tangney J P, Burggraf S A, Wagner P E 1995 Shame-proneness, guilt-proneness, and psychological symptoms. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford, New York, pp. 343–67
Tangney J P, Miller R S, Flicker L, Barlow D H 1996 Are shame, guilt and embarrassment distinct emotions? Journal of Personality and Social Psychology 70: 1256–69
Tangney J P, Wagner P, Gramzow R 1992 Proneness to shame, proneness to guilt, and psychopathology. Journal of Abnormal Psychology 101: 469–78

J. P. Tangney
Self-development in Childhood

In addition to domain-specific self-concepts, the ability to evaluate one's overall worth as a person emerges in middle childhood. The level of such global self-esteem varies tremendously across children and is determined by how adequate they feel in domains of importance as well as the extent to which significant others (e.g., parents and peers) approve of them as a person. Efforts to promote positive self-esteem are critical, given that low self-esteem is associated with many psychological liabilities including depressed affect, lack of energy, and hopelessness about the future.
1. Introduction

Beginning in the second year of life, toddlers begin to talk about themselves. With development, they come to understand that they possess various characteristics, some of which may be positive ('I'm smart') and some of which may be negative ('I'm unpopular'). Of particular interest is how the very nature of such self-evaluations changes with development as well as among individual children and adolescents across two basic evaluative categories: (a) domain-specific self-concepts, i.e., how one judges one's attributes in particular arenas (e.g., scholastic competence, social competence), and (b) global self-esteem, i.e., how one evaluates one's overall worth as a person (for a complete treatment of self-development in childhood and adolescence, see Harter 1999).

Developmental shifts in the nature of self-evaluations are driven by changes in the child's cognitive capabilities (see Self-knowledge: Philosophical Aspects and Self-evaluative Process, Psychology of). Cognitive-developmental theory and findings (see Piaget 1960, 1963, Fischer 1980) alert us to the fact that the young child is limited to very specific, concrete representations of self and others, for example, 'I know my ABCs' (see Harter 1999). In middle to later childhood, the ability to form higher-order concepts about one's attributes and abilities (e.g., 'I'm smart') emerges. There are further cognitive advances at adolescence, allowing the teenager to form abstract concepts about
the self that transcend concrete behavioral manifestations and higher-order generalizations (e.g., 'I'm intelligent').
2. Developmental Differences in Domain-specific Self-concepts
Domain-specific evaluative judgments are observed at every developmental level. However, the precise nature of these judgments varies with age (see Table 1). In Table 1, five common domains in which children and adolescents make evaluative judgments about the self are identified: Scholastic competence, Physical competence, Social competence, Behavioral conduct, and Physical appearance. The types of statements vary, however, across three age periods, early childhood, later childhood, and adolescence, in keeping with the cognitive abilities of each age period.

2.1 Early Childhood

Young children provide very concrete accounts of their capabilities, evaluating specific behaviors. Thus, they communicate how they know their ABCs, how they can run very fast, how they are nice to a particular friend, how they don't hit their sister, and how they possess a specific physical feature such as pretty blond hair. Of particular interest in such accounts is the fact that the young child typically provides a litany of virtues, touting his or her positive skills and attributes. One cognitive limitation of this age period is that the young child cannot distinguish the wish to be competent from reality. As a result, they typically overestimate their abilities because they do not yet have the skills to evaluate themselves realistically. Another cognitive characteristic that contributes to potential distortions is the pervasiveness of all-or-none thinking. That is, evaluations are either all positive or all negative. With regard to self-evaluations, they are typically all positive. (Exceptions to this positivity bias can be observed in children who are chronically abused, since severe maltreatment is often accompanied by parental messages that make the child feel inadequate, incompetent, and unlovable. Such children will also engage in all-or-none thinking but conclude that they are all bad.)

2.2 Middle to Later Childhood

As the child grows older, the ability to make higher-order generalizations in evaluating his or her abilities and attributes emerges. Thus, rather than cite prowess at a particular activity, the child may observe that he or she is good at sports, in general. This inference can further be justified in that the child can describe his or her talent at several sports (e.g., good at soccer, basketball, baseball). Thus, the higher-order generalization represents a cognitive construction in which an over-arching judgment (good at sports) is defined in terms of specific examples which warrant this conclusion. Similar processes allow the older child to conclude that he or she is smart (e.g., does well in math, science, and history). The structure of a higher-order generalization about being well behaved could include such components as obeying parents, not getting in trouble, and trying to do what is right. A generalization concerning the ability to make friends may subsume accounts of having friends at school, making friends easily at camp, and developing friendships readily upon moving to a new neighborhood. The perception that one is good-looking may be based on one's positive evaluation of one's face, hair, and body.

During middle childhood, all-or-none thinking diminishes and the aura of positivity fades. Thus, children do not typically think that they are all virtuous in every domain. The more common pattern is for them to feel more adequate in some domains than others. For example, one child may feel that he or she is good at schoolwork and is well behaved, whereas he or she is not that good at sports, does not think that he or she is good-looking, and reports that it is hard to make friends. Another child may report the opposite pattern.
Table 1. Developmental changes in the nature of self-evaluative statements across different domains

Domain                 Early childhood:            Later childhood:          Adolescence:
                       specific behaviors          generalizations (a)       abstractions (a)
Scholastic competence  I know my A,B,C's           I'm smart in school       I'm intelligent
Athletic competence    I can run very fast         I'm good at sports        I'm athletically talented
Social competence      I'm nice to my friend,      It's easy for me to       I'm popular
                       Jason                       make friends
Behavioral conduct     I don't hit my sister       I'm well behaved          I think of myself as a
                                                                             moral person
Physical appearance    I have pretty blond hair    I'm good looking          I'm physically attractive

(a) Examples in the table represent positive self-evaluations. However, during later childhood and adolescence, negative judgments are also observed.
Moreover, they may report both positive and negative judgments within a given domain; for example, they are smart in some school subjects (math and science) but 'dumb' in others (English and social studies). Such evaluations may also be accompanied by self-affects that also emerge in later childhood, for example, feeling proud of one's accomplishments but ashamed of one's perceived failures (see also Self-conscious Emotions, Psychology of). This ability to consider both positive and negative characteristics is a major cognitive–developmental acquisition. Thus, beginning in middle to later childhood, these distinctions result in a profile of self-evaluations across domains.

Contributing to this advance is the ability to engage in social comparison. Beginning in middle childhood, one can use comparisons with others as a barometer of the skills and attributes of the self. In contrast, the young child cannot simultaneously compare his or her attributes to the characteristics of another in order to detect similarities or differences that have implications for the self. Although the ability to utilize social comparison information for the purpose of self-evaluation represents a cognitive–developmental advance, it also ushers in new potential liabilities. With the emergence of the ability to rank-order the performance of other children, all but the most capable children will necessarily fall short of excellence. Thus, the very ability and penchant to compare the self with others makes one's self-concept vulnerable, particularly if one does not measure up in domains that are highly valued. The more general effects of social comparison can be observed in findings revealing that domain-specific self-concepts become more negative during middle and later childhood, compared to early childhood.
2.3 Adolescence

For the adolescent, there are further cognitive-developmental advances that alter the nature of domain-specific self-evaluations. As noted earlier, adolescence brings with it the ability to create more abstract judgments about one's attributes and abilities. Thus, one no longer merely considers oneself to be good at sports but to be athletically talented. One is no longer merely smart but views the self more generally as intelligent, where successful academic performance, general problem-solving ability, and creativity might all be subsumed under the abstraction of intelligence. Abstractions may be similarly constructed in the other domains. For example, in the domain of behavioral conduct, there will be a shift from the perception that one is well behaved to a sense that one is a moral or principled person. In the domains of social competence and appearance, abstractions may take the form of perceptions that one is popular and physically attractive.
These illustrative examples all represent positive self-evaluations. However, during adolescence (as well as in later childhood), judgments about one's attributes will also involve negative self-evaluations. Thus, certain individuals may judge the self to be unattractive, unpopular, unprincipled, etc. Of particular interest is the fact that when abstractions emerge, the adolescent typically does not have total control over these new acquisitions, just as one lacks a certain level of control when first acquiring a new athletic skill (e.g., swinging a bat, maneuvering skis). In the cognitive realm, such lack of control often leads to overgeneralizations that can shift dramatically across situations or time. For example, the adolescent may conclude at one point in time that he or she is exceedingly popular but then, in the face of a minor social rebuff, may conclude that he or she is extremely unpopular. Gradually, adolescents gain control over these self-relevant abstractions such that they become capable of more balanced and accurate self-representations (see Harter 1999).
3. Global Self-esteem

The ability to evaluate one's worth as a person also undergoes developmental change. The young child is simply incapable, cognitively, of developing a verbal concept of his or her value as a person. This ability emerges at the approximate age of eight. However, young children exude a sense of value or worth in their behavior. The primary behavioral manifestations involve displays of confidence, independence, mastery attempts, and exploration (see Harter 1999). Thus, behaviors that communicate to others that children are sure of themselves are manifestations of high self-esteem in early childhood.

At about the third grade, children begin to develop the concept that they like, or don't like, the kind of person they are (Harter 1999, Rosenberg 1979). Thus, they can respond to general items asking them to rate the extent to which they are pleased with themselves, like who they are, and think they are fine, as a person. Here, the shift reflects the emergence of an ability to construct a higher-order generalization about the self. This type of concept can be built upon perceptions that one has a number of specific qualities, for example, that one is competent, well behaved, attractive, etc. (namely, the type of domain-specific self-evaluations identified in Table 1). It can also be built upon the observation that significant others, for example, parents, peers, and teachers, think highly of the self. This process is greatly influenced by advances in the child's ability to take the perspective of significant others (Selman 1980). During adolescence, one's evaluation of one's global worth as a person may be further elaborated, drawing upon more domains and sources of approval, and will also become more abstract. Thus, adolescents can directly acknowledge
that they have high or low self-esteem, as a general abstraction about the self (see also Self-esteem in Adulthood).
4. Individual Differences in Domain-specific Self-concepts as well as Global Self-esteem

Although there are predictable, cognitively based developmental changes in the nature of how most children and adolescents describe and evaluate themselves, there are striking individual differences in how positively or negatively the self is evaluated. Moreover, one observes different profiles of children's perceptions of their competence or adequacy across the various self-concept domains, in that children evaluate themselves differently across domains.

Consider the profiles of four different children. One child, Child A, may feel very good about her scholastic performance, although this is in sharp contrast to her opinion of her athletic ability, where she evaluates herself quite poorly. Socially she feels reasonably well accepted by her peers. In addition, she considers herself to be well behaved. Her feelings about her appearance, however, are relatively negative. Child A also reports very high self-esteem. Another child, Child B, has a very different configuration of scores. This is a boy who feels very incompetent when it comes to schoolwork. However, he considers himself to be very competent athletically, and feels well received by peers. He judges his behavioral conduct to be less commendable. In contrast, he thinks he is relatively good-looking. Like Child A, he also reports high self-esteem.

Other profiles are exemplified by Child C and Child D, neither of whom feels good about himself or herself scholastically or athletically. They evaluate themselves much more positively in the domains of social acceptance, conduct, and physical appearance. In fact, their profiles are quite similar to each other across the five specific domains. However, judgments of their self-esteem are extremely different: Child C has very high self-esteem whereas Child D has very low self-esteem. This raises a puzzling question: how can two children look so similar with regard to their domain-specific self-concepts but evaluate their global self-esteem so differently? We turn to this issue next, in examining the causes of global self-esteem.
5. The Causes of Children's Level of Self-esteem

Our understanding of the antecedents of global self-esteem has been greatly aided by the formulations of two historical scholars of the self, William James (1892) and Charles Horton Cooley (1902). Each suggested rather different pathways to self-esteem, defined as an overall evaluation of one's worth as a person (see reviews by Harter 1999, Rosenberg 1979). James focused on how the individual assessed his or her competence in domains where one had aspirations to succeed. Cooley focused on the salience of the opinions that others held about the self, opinions which one incorporated into one's global sense of self (see Self: History of the Concept).
5.1 Competence–Adequacy in Domains of Importance

For James, global self-esteem derived from the evaluation of one's sense of competence or adequacy in the various domains of one's life relative to how important it was to be successful in these domains. Thus, if one feels successful in domains deemed important, high self-esteem will result. Conversely, if one falls short of one's goal in domains where one has aspirations to be successful, one will experience low self-esteem. One does not, therefore, have to be a superstar in every domain to have high self-esteem. Rather, one only needs to feel adequate or competent in those areas judged to be important. Thus, a child may evaluate himself or herself as unathletic; however, if athletic prowess is not an aspiration, then self-esteem will not be negatively affected. That is, the high self-esteem individual can discount the importance of areas in which he or she does not feel successful.

This analysis can be applied to the profiles of Child C and Child D. In fact, we have directly examined this explanation in research studies by asking children to rate how important it is for them to be successful in each domain (Harter 1999). The findings reveal that high self-esteem individuals feel competent in the domains they rate as important, whereas low self-esteem individuals report that areas in which they are unsuccessful are still very important to them. Thus, Child C represents an example of an individual who feels that social acceptance, conduct, and appearance, domains in which she evaluates herself positively, are very important, but that the two domains where she is less successful, scholastic competence and athletic competence, are not that important. In contrast, Child D rates all domains as important, including the two domains where he is not successful, scholastic competence and athletic competence. Thus, the combination of high importance and perceived inadequacy contributes to low self-esteem.
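James famously condensed this idea into a ratio, self-esteem = successes / pretensions. The sketch below is only an arithmetic gloss on that account as it applies to Children C and D; it is not Harter's actual scoring procedure, and the domain labels, 1–4 rating scale, and weighting scheme are illustrative assumptions.

```python
# Illustrative sketch of James's competence-importance account of self-esteem.
# Not Harter's scoring procedure; ratings and weights are hypothetical.

def weighted_self_esteem(competence, importance):
    """Average perceived competence across domains, weighted by rated importance.

    competence, importance: dicts mapping domain -> rating (1 = low ... 4 = high).
    Domains rated unimportant carry little weight, so feeling inadequate in
    them can be 'discounted' without lowering the overall score.
    """
    total_weight = sum(importance.values())
    return sum(competence[d] * importance[d] for d in competence) / total_weight

# Children C and D share the same competence profile ...
profile = {"scholastic": 1, "athletic": 1, "social": 4, "conduct": 4, "appearance": 4}

# ... but Child C discounts the domains where she feels unsuccessful,
child_c = weighted_self_esteem(profile, {"scholastic": 1, "athletic": 1,
                                         "social": 4, "conduct": 4, "appearance": 4})
# whereas Child D rates every domain, including the unsuccessful ones, as important.
child_d = weighted_self_esteem(profile, {"scholastic": 4, "athletic": 4,
                                         "social": 4, "conduct": 4, "appearance": 4})

print(round(child_c, 2), round(child_d, 2))  # 3.57 vs. 2.8: same profile, different self-esteem
```

On these toy numbers, the identical competence profile yields a clearly higher score for Child C than for Child D, which is exactly the discounting pattern the research findings describe.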
5.2 Incorporation of the Opinions of Significant Others

Another important factor influencing self-esteem can be derived from the writings of Cooley (1902), who metaphorically made reference to the 'looking-glass self' (see Oosterwegel and Oppenheimer 1993). According to this formulation, significant others (e.g., parents and peers) were social mirrors into which one gazed in order to determine what they thought of the self.
[Figure 1. Findings on how competence in domains of importance and social support combine to predict global self-esteem. The plot shows self-esteem (2.00–3.75) against average competence in important domains (low, moderate, high) for three levels of social support (high, moderate, low).]
Thus, in evaluating the self, one would adopt what one felt were the judgments of these others whose opinions were considered important. Thus, the approval, support, or positive regard from significant others became a critical source of one's own sense of worth as a person. For example, children who receive approval from parents and peers will report much higher self-esteem than children who experience disapproval from parents and peers.

Findings reveal that both of these factors, competence in domains of importance and the perceived support of significant others, combine to influence a child's or adolescent's self-esteem. Thus, as can be observed in Fig. 1, those who feel competent in domains of importance and who also report high support rate themselves as having the highest self-esteem. Those who feel inadequate in domains deemed important and who also report low levels of support rate themselves as having the lowest self-esteem. Other combinations fall in between (data from Harter 1993).
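The pattern in Fig. 1 is essentially additive, and a few lines of code can make that explicit. The sketch below is a hypothetical illustration of how two predictors combine; the intercept and weights are invented for the example and are not the coefficients estimated in Harter (1993).

```python
# Hypothetical additive model of the pattern in Fig. 1: predicted self-esteem
# rises with both competence in important domains and perceived social support.
# Intercept and weights are invented for illustration, not estimated values.

LEVELS = {"low": 1, "moderate": 2, "high": 3}

def predicted_self_esteem(competence, support, intercept=1.30, w_comp=0.45, w_supp=0.25):
    return intercept + w_comp * LEVELS[competence] + w_supp * LEVELS[support]

for support in ("high", "moderate", "low"):
    row = {c: round(predicted_self_esteem(c, support), 2)
           for c in ("low", "moderate", "high")}
    print(f"{support:>8} support: {row}")
# Each printed line reproduces one curve of Fig. 1: parallel lines rising
# with competence, ordered from high to low support.
```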
6. Conclusions

Two types of self-representations that can be observed in children and adolescents were distinguished: evaluative judgments of competence or adequacy in specific domains, and the global evaluation of one's worth as a person, namely overall self-esteem. Each of these undergoes developmental change based on age-related cognitive advances. In addition, older children and adolescents vary tremendously with regard to whether self-evaluations are positive or negative. Within a given individual, there will be a profile of self-evaluations, some of which are more positive and some more negative. More positive self-concepts in domains considered important, as well as approval
from significant others, will lead to high self-esteem. Conversely, negative self-concepts in domains considered important, coupled with lack of approval from significant others, will result in low self-esteem. Self-esteem is particularly important since it is associated with very important outcomes or consequences. Perhaps the most well-documented consequence of low self-esteem is depression. Children and adolescents (as well as adults) with the constellation of causes leading to low self-esteem will invariably report that they feel emotionally depressed and hopeless about their futures; the most seriously depressed consider suicide. Thus, it is critical that we intervene for those experiencing low self-esteem. Our model of the causes of self-esteem suggests strategies that may be fruitful, for example, improving skills, helping individuals discount the importance of domains in which it is unlikely that they can improve, and providing support in the form of approval for who they are as people.

Future research, however, is necessary to determine the different pathways to low and high self-esteem. For example, for one child, the sense of inadequacy in particular domains may be the pathway to low self-esteem. For another child, lack of support from parents or peers may represent the primary cause. Future efforts should be directed to the identification of these different pathways since they have critical implications for intervention efforts to enhance feelings of worth for those children with low self-esteem (see also Self-concepts: Educational Aspects). Positive self-esteem is clearly a psychological commodity, a resource that is important for us to foster in our children and adolescents if we want them to lead productive and happy lives.

See also: Identity in Childhood and Adolescence; Personality Development in Childhood; Self-concepts: Educational Aspects; Self: History of the Concept; Self-knowledge: Philosophical Aspects; Self-regulation in Childhood
Bibliography

Cooley C H 1902 Human Nature and the Social Order. Charles Scribner's Sons, New York
Damon W, Hart D 1988 Self-understanding in Childhood and Adolescence. Cambridge University Press, New York
Fischer K W 1980 A theory of cognitive development: the control and construction of hierarchies of skills. Psychological Review 87: 477–531
Harter S 1993 Causes and consequences of low self-esteem in children and adolescents. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Plenum, New York
Harter S 1999 The Construction of the Self: A Developmental Perspective. Guilford Press, New York
James W 1892 Psychology: The Briefer Course. Henry Holt, New York
Oosterwegel A, Oppenheimer L 1993 The Self-system: Developmental Changes Between and Within Self-concepts. Erlbaum, Hillsdale, NJ
Piaget J 1960 The Psychology of Intelligence. Littlefield, Adams, Patterson, NJ
Piaget J 1963 The Origins of Intelligence in Children. Norton, New York
Rosenberg M 1979 Conceiving the Self. Basic Books, New York
Selman R L 1980 The Growth of Interpersonal Understanding. Academic Press, New York
S. Harter
Self-efficacy

Self-efficacy refers to the individual's capacity to produce important effects. People who are aware of being able to make a difference feel good and therefore take initiatives; people who perceive themselves as helpless are unhappy and are not motivated to act. This article treats the main concepts related to self-efficacy, their theoretical and historical contexts, their functions and practical uses, as well as developmental and educational/therapeutic aspects.
1. Concepts

Everything that happens is caused to happen (Aristotle). Making changes means being a cause or providing a cause that produces a change. As the true causes are difficult to identify, the terms conditions and contingencies are often used instead. An effect is contingent upon a condition or a set of conditions if it always occurs when the condition or the set of conditions is met. Such conditions are sufficient but not necessary for producing the effect. Here we are interested in human actions as necessary conditions of change (see Motivation and Actions, Psychology of). Actions, too, depend on conditions. Considering person-related conditions of effective actions, we can differentiate aspects like knowledge, initiative, perseverance, intelligence, experience, physical force, help from others, and more. Thus, instead of saying that an actor is able to produce a certain effect, we can say more elaborately that an actor is endowed with certain means or conditions that enable him or her to attain certain goals (Fig. 1).

We say that individuals (or groups) are in control of a specific goal if they are able to produce the corresponding changes (horizontal line of Fig. 1). More elaborately, they are in control of a specific goal if they are aware of the necessary contingencies and if they are competent enough to make these contingencies work (both diagonal lines in Fig. 1). Control is complete if these contingencies are necessary and sufficient; control is partial if the contingencies are necessary but not sufficient. Instead of control, Bandura (1977) introduced the word efficacy, more specifically self-efficacy. We use control and self-efficacy interchangeably.
[Figure 1. Means–ends relations and agency as components of control (adapted from Skinner et al. 1988 and Flammer 1990).]
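The contingency definition above can be glossed in standard propositional notation (an illustration added here, not notation used in the original text): a condition set is sufficient for an effect when meeting all of its members guarantees the effect, yet it need not be necessary, since other condition sets may produce the effect as well.

```latex
% Sufficiency: whenever every condition c_1 ... c_k in the set is met,
% the effect E occurs.
(c_1 \land c_2 \land \cdots \land c_k) \;\rightarrow\; E
% Non-necessity: E does not imply this particular condition set,
% because alternative condition sets may also bring E about.
E \;\not\rightarrow\; (c_1 \land c_2 \land \cdots \land c_k)
```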
Controlling means putting control into action. This is not equivalent to having control or being in control: people are in control of certain states of affairs if they can put control into action, even if they do not. For example, somebody has control over buying a new bicycle even if he or she does not buy it. People can have control over certain events without knowing it; they will then probably miss possible chances to activate such control. On the other hand, people may believe that they have some control but in fact do not. That may make them feel good as long as there is no need to put this control to work. Obviously, it is important that people do not only have control, but that they also know that they have control.

Not being in control of an important situation is equivalent to being helpless in this respect (Seligman 1975). It has been shown that the psychological effects of helplessness (HL) differ depending on whether the helpless person believes himself or herself to be helpless forever (chronic HL), whether being helpless is unique to the person (personal vs. universal HL), and whether helplessness is related to a specific domain or to most domains of life (specific vs. global HL). In the worst case, helpless people are (a) deeply sad about not having control, (b) demotivated to take initiatives or to invest effort and perseverance, (c) cognitively blind to any alternative or better view of the state of the world, and (d) inclined to devalue themselves.

Obviously, at least in subjectively important domains, we prefer self-efficacy to helplessness: self-efficacy beliefs provide us with security and pride. When we lack self-efficacy in important domains we either strive for self-efficacy (by fighting, learning, or training) or search for compensation. A common type of compensation consists of seeking help or delegating personal control (= indirect control or proxy control), e.g., paying a gardener to care for one's garden, putting a doctor in charge of one's health, or praying to God for a favor in a seemingly hopeless situation. Another way of compensating for lacking (primary) control is to use secondary control (Rothbaum et al. 1982). While control (i.e., primary control) consists of making the
world fit with one's goals and aspirations, secondary control accommodates personal aspirations or personal interpretations of the actual state in order to make them fit with the world (see Coping across the Lifespan).
2. History of the Concept of Self-efficacy

In the 1950s, Rotter (1954) suggested the concept of locus of control, meaning the place where control of desired reinforcement for behavior is exerted. Internal control means control within the person; external control means control outside the person, possibly in powerful others, in objective external conditions, or in chance or luck. Rotter and his associates have developed valid measuring instruments that have been used in thousands (!) of studies demonstrating that internal locus of control is positively correlated with almost all desirable attributes of humans.

Fritz Heider (1944), who studied the subjective attributions for observed actions, had already suggested the concepts of internality and externality. The true origin, i.e., the 'cause' of an observed action, is either attributed to the person (internal, personal liability) or to person-independent conditions (external, no personal liability). Heider's work has triggered a large research tradition on causal attributions. Results of this research helped to differentiate subjective interpretations of experiences of helplessness (see above; see also Attributional Processes: Psychological). Consequently, attribution theory remains an important element of self-efficacy theory.

Modern self-efficacy theory goes beyond Rotter's theory insofar as it is more differentiated (e.g., contingency vs. competence, primary vs. secondary), refers distinctly to specific domains of action (e.g., health, school), and is elaborated to include aspects other than personality (e.g., motivation, development).
3. Self-efficacy as an Important Element of a Happy and Successful Person

Individuals with high self-efficacy beliefs also report strong feelings of well-being and high self-esteem in general (Bandura 1997, Flammer 1990). They are willing to take initiative in related domains, to apply effort if needed, and to persevere in efforts as long as they believe in their efficacy. Potentially stressful situations produce less subjective stress in highly self-efficacious individuals. However, while self-efficacy acts as a buffer against stress, it can also—indirectly—produce stress insofar as it can induce overly ambitious individuals to assume more responsibilities than they are able to cope with in sheer quantity. Moreover, self-efficacy has been reported to exert a positive influence on recovery from surgery or illness
and on healthy lifestyles. It is not surprising that high self-efficacy beliefs enhance school success; likewise, school failure undermines the corresponding self-efficacy beliefs, again partly depending on the individual's attributional patterns. Interestingly, it has been demonstrated repeatedly and in several cultures that in most domains healthy and happy individuals tend to slightly overestimate themselves. Realistic estimation of self-efficacy is rather typical for persons vulnerable to depressed mood, and clear underestimation increases the chance of a clinical (reactive) depression. On the other hand, major overestimation might result in painful and harmful clashes with reality.
4. The Development of Self-efficacy Beliefs

Evidently, the newborn baby does not have self-efficacy beliefs in our sense. The basic structure of self-efficacy beliefs develops within the first three or four years. According to Flammer's (1995) analysis, the infant's development towards the basic understanding of self-efficacy proceeds through a developmental sequence consisting of the acquisition of (a) the basic event schema (i.e., that classes of events happen), (b) the elementary causal schema (conditions, i.e., actions, events), (c) the understanding of personally producing effects, (d) the understanding of success and failure in aiming at nontrivial goals (visible as pride and as shame, respectively), and (e) the discovery of being not only the origin of one certain change but also capable of producing such changes. Obviously, this development proceeds in the domains that are accessible to the infant so far. Later on, this development will have to be extended to further domains.

As to the domain of school success, within the second half of the first decade of life, the child learns more and more differentiations of means towards the same ends. Thus, he or she gradually abandons a global concept of simply being or not being able and singles out—probably in this sequence—the factor effort (more effort is needed to solve tasks—a typical lesson to be learned early in school), the factors individual ability and task difficulty (higher difficulty requiring more ability), and finally the understanding of the compensatory relation between effort and ability (it is possible to reach the same goals by being less capable but more hard-working).

In adolescence and early adulthood more lessons have to be learned. More and more domains become accessible to personal control due to increased cognitive, physical, or economic strength and social power. This is exciting, indeed. However, individuals permanently have to select from the choices which are offered to them (Flammer 1996). Trying to control everything results in overburdening. One thing is to deselect control domains because they compete with higher-priority control domains; another thing is to be forced to renounce control because no accessible
contingencies seem to exist. As long as there are enough attractive alternatives available, this is not painful, but it can severely hurt handicapped individuals and old people when they lose control of important domains. Old people are well advised both not to resign too early and to search for compensations. Such compensations consist of artifacts of all kinds (from memory aids to hearing aids), but they also include the above-mentioned compensations like indirect control (social resources) and secondary control. Indeed, it seems that the extent and the importance of secondary control increase over the lifespan (Heckhausen and Schulz 1995). Baltes and Silverberg (1994) have even suggested that, under certain conditions, people in old people's homes adjust better if they give up personal control in certain domains altogether. Alluding to the concept of learned helplessness, they called such behavior learned dependency. Learned dependency helps to avoid certain social conflicts; the only remaining personal control may be the control of giving in.
5. Educational and Therapeutic Aspects

Given the development of the basic structure of self-efficacy, contingent behavior by the caregivers is crucial from the first weeks of life onward. Caregivers' behavior should be predictable, i.e., contingent at least upon the baby's actual behavior, and as far as possible upon the baby's perceptions, feelings, and intentions. This requires an enormous degree of sensitivity towards the child. Fortunately, researchers have demonstrated that parental empathy is partly a natural gift in the majority of attentive parents. Studies have shown that contingent behavior fosters children's happiness, but also their willingness to learn and their curiosity. If caregivers are judged as not reacting contingently enough, we also have to consider that some babies show quite unorganized behavior and make the caregiver's task very difficult ('difficult babies'). In such cases it is difficult to decide whether the noncontingency has originated from the caregiver or from the baby.

The subsequent steps in the development of self-efficacy require that caregivers provide freedom for experimentation, let the child try things by himself or herself, and comment on successes and failures in a way that lets the child establish and maintain confidence in his or her efficacy (Schneewind 1995). Nevertheless, caregivers should try to protect the child from dangerous and frequent hopeless experiences.

Psychotherapy with individuals who have severely undermined self-efficacy beliefs is difficult. Teaching them and trying to convince them that they are really capable even when they believe they are not does not help much. Helping them to recall prior success experiences, instead of being impressed only by failures, is more effective. Even more effective are new and successful experiences. Helpless individuals not only
interpret failures to their disadvantage; they also play down their contribution to eventual success. Parallel to these findings, memory research has demonstrated that depressed people's memories of their own actions are biased towards recalling more of their failures than their successes. This leaves us with an important contrast: while children and healthy adults tend to overestimate their self-efficacy, individuals who have lost confidence in themselves immunize such devastating beliefs against disconfirmation by not trying anymore, by self-damaging attributions, and by recalling their biography in a way that is consistent with their beliefs. Given the pervasive influence of positive beliefs in self-efficacy, it is important to help individuals establish and maintain self-efficacy beliefs at a high level, and to guide failure-expecting persons to positive experiences.
6. Conclusion

Within the last decades, theory and research have established self-efficacy beliefs as important elements in the understanding of human action and human well-being in a very broad sense. However, little is known so far about differences in self-efficacy beliefs across different life domains and among different cultures. Further research should include more systematic comparisons between cultures, between life domains, and—if possible—between historical times. In addition, it is suggested that in the future investigators consider more seriously the fact that all changes are due to a multitude of necessary conditions. More specifically, there is a need for researchers to consider the efficacy and efficacy beliefs of interacting people, that is, to examine concepts such as shared control or 'common efficacy.'

See also: Control Behavior: Psychological Perspectives; Learned Helplessness; Motivation and Actions, Psychology of; Self-efficacy and Health; Self-efficacy: Educational Aspects; Self-regulation in Adulthood; Self-regulation in Childhood
Bibliography

Baltes M M, Silverberg S B 1994 The dynamics between dependency and autonomy. In: Featherman D L, Lerner R M, Perlmutter M (eds.) Life-span Development and Behavior. Erlbaum, New York
Bandura A 1977 Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review 84: 191–215
Bandura A 1997 Self-efficacy: The Exercise of Control. Freeman, New York
Flammer A 1990 Erfahrung der eigenen Wirksamkeit [Experiencing One's Own Efficacy]. Huber, Bern, Switzerland
Flammer A 1995 Developmental analysis of control beliefs. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York
Flammer A 1996 Entwicklungstheorien [Theories of Development]. Huber, Bern, Switzerland
Heckhausen J, Schulz R 1995 A life-span theory of control. Psychological Review 102: 284–304
Heider F 1944 Social perception and phenomenal causality. Psychological Review 51: 358–74
Rothbaum F, Weisz J R, Snyder S S 1982 Changing the world and changing the self: a two-process model of perceived control. Journal of Personality and Social Psychology 42: 5–37
Rotter J B 1954 Social Learning and Clinical Psychology. Prentice-Hall, Englewood Cliffs, NJ
Schneewind K A 1995 Impact of family processes on control beliefs. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York
Seligman M E P 1975 Helplessness: On Depression, Development and Death. Freeman, San Francisco
Skinner E A, Chapman M, Baltes P B 1988 Control, means–ends, and agency beliefs: A new conceptualization and its measurement during childhood. Journal of Personality and Social Psychology 54: 117–33
A. Flammer
Self-efficacy and Health

The quality of human health is heavily influenced by lifestyle habits. By exercising control over several health habits, people can live longer, healthier lives and slow the process of aging (see Control Beliefs: Health Perspectives). These habits include exercising, reducing dietary fat, refraining from smoking, keeping blood pressure down, and developing effective ways of coping with stressors. If the huge health benefits of these few lifestyle habits were put into a pill, it would be declared a spectacular breakthrough in the field of medicine. Recent years have witnessed a major change in the conception of human health and illness, from a disease model to a health model. It is just as meaningful to speak of levels of vitality as of degrees of impairment. The health model, therefore, focuses on health promotion as well as disease prevention. Perceived self-efficacy plays a key role in the self-management of habits that enhance health and those that impair it.
1. Perceived Self-efficacy

Perceived self-efficacy is concerned with people's beliefs in their capabilities to exercise control over their own functioning and over environmental events. Such beliefs influence what courses of action people choose to pursue, the goals they set for themselves and their commitment to them, how much effort they put forth in given endeavors, how long they persevere in the face of obstacles and failure experiences, their resilience to adversity, whether their thought patterns are self-hindering or self-aiding, how much stress and depression they experience in coping with taxing
environmental demands, and the level of accomplishments they realize (Bandura 1997, Schwarzer 1992).

In social cognitive theory, perceived self-efficacy operates in concert with other determinants in regulating lifestyle habits. These include the positive and negative outcomes people expect their actions to produce. Such outcome expectations may take the form of aversive and pleasurable physical effects, approving and disapproving social reactions, or self-evaluative consequences expressed as self-satisfaction and self-censure. Personal goals, rooted in a value system, provide further self-incentives and guides for health habits. Perceived sociostructural facilitators and impediments operate as another set of determinants of health habits. Self-efficacy is a key determinant in this causal structure because it affects health behavior both directly and by its influence on these other determinants. The stronger the perceived efficacy, the higher the goal challenges people set for themselves, the more they expect their efforts to produce desired outcomes, and the more they view obstacles and impediments to personal change as surmountable.

There are two major ways in which a sense of personal efficacy affects human health. At the more basic level, such beliefs activate biological systems that mediate health and disease. The second level is concerned with the exercise of direct control over habits that affect health and the rate of biological aging.
2. Impact of Efficacy Beliefs on Biological Systems

Stress is an important contributor to many physical dysfunctions (O'Leary 1990). Perceived controllability appears to be the key organizing principle in explaining the biological effects of stress. Exposure to stressors with the ability to exercise some control over them has no adverse physical effects. But exposure to the same stressors without the ability to control them impairs immune function (Herbert and Cohen 1993b, Maier et al. 1985). Epidemiological and correlational studies indicate that lack of behavioral or perceived control over stressors increases susceptibility to bacterial and viral infections, contributes to the development of physical disorders, and accelerates the rate of progression of disease (Schneiderman et al. 1992).

In social cognitive theory, stress reactions arise from perceived inefficacy to exercise control over aversive threats and taxing environmental demands (Bandura 1986). If people believe they can deal effectively with potential stressors, they are not perturbed by them. But if they believe they cannot control aversive events, they distress themselves and impair their level of functioning. Perceived inefficacy to manage stressors activates autonomic, catecholamine, and opioid systems that modulate the immune system in ways that
can increase susceptibility to illness (Bandura 1997, O'Leary 1990).

The immunosuppressive effects of stressors are not the whole story, however. People are repeatedly bombarded with taxing demands and stressors in their daily lives. If stressors only impaired immune function, people would be highly vulnerable to infective agents that would leave them chronically bedridden with illnesses or quickly do them in. Most human stress is activated while competencies are being developed and expanded. Stress aroused while gaining a sense of mastery over aversive events strengthens components of the immune system (Wiedenfeld et al. 1990). The more rapid the growth of perceived coping efficacy, the greater the boost of the immune system. Immunoenhancement during development of coping capabilities vital to effective adaptation has evolutionary survival value. The field of health functioning has been heavily preoccupied with the physiologically debilitating effects of stressors. Self-efficacy theory also acknowledges the physiologically strengthening effects of mastery over stressors. As Dienstbier (1989) has shown, a growing number of studies provides empirical support for physiological toughening by successful coping.

Depression is another affective pathway through which perceived coping efficacy can affect health functioning. Depression has been shown to reduce immune function and to heighten susceptibility to disease (Herbert and Cohen 1993a). The more severe the depression, the greater the reduction in immunity. Perceived inefficacy to exercise control over things one values highly produces bouts of depression (Bandura 1997).

Social support reduces vulnerability to stress, depression, and physical illness. But social support is not a self-forming entity waiting around to buffer harried people against stressors. People have to go out and find, create, and maintain supportive relationships for themselves. This requires a robust sense of social efficacy. Perceived social inefficacy contributes to depression both directly and by curtailing development of social supports (Holahan and Holahan 1987). Social support, in turn, enhances perceived self-efficacy. Mediational analyses show that social support alleviates depression and fosters health-promoting behavior only to the extent that it boosts personal efficacy.
3. Self-efficacy in Promoting Healthful Lifestyles

Lifestyle habits can enhance or impair health (see Health Behaviors). This enables people to exert some behavioral control over their vitality and quality of health. Efficacy beliefs affect every phase of personal change: whether people even consider changing their
health habits; whether they enlist the motivation and perseverance needed to succeed should they choose to do so; and how well they maintain the habit changes they have achieved (Bandura 1997).
3.1 Initiation of Change

People's beliefs that they can motivate themselves and regulate their own behavior play a crucial role in whether they even consider changing detrimental health habits. They see little point in even trying if they believe they do not have what it takes to succeed. If they make an attempt, they give up easily in the absence of quick results or in the face of setbacks. Among those who change detrimental health habits on their own, the successful ones have stronger perceived self-efficacy at the outset than nonchangers and subsequent relapsers.

Efforts to get people to adopt healthful practices rely heavily on persuasive communications in health education campaigns. Health communications foster adoption of healthful practices mainly by raising beliefs in personal efficacy, rather than by transmitting information on how habits affect health, by arousing fear of disease, or by increasing perception of one's personal vulnerability or risk (Meyerowitz and Chaiken 1987). Helping people reduce health-impairing habits therefore requires a change in emphasis, from trying to scare people into health to equipping them with the skills and self-beliefs needed to exercise control over their health habits. In community-wide health campaigns, people's pre-existing beliefs that they can exercise control over their health habits, and the efficacy beliefs enhanced by the campaign, both contribute to health-promoting habits (Maibach et al. 1991).
3.2 Adoption of Change

Effective self-regulation of health behavior is not achieved through an act of will. It requires the development of self-regulatory skills. To build their sense of efficacy, people must develop skills for influencing their own motivation and behavior. In such programs, they learn how to monitor their health behavior and the social and cognitive conditions under which they engage in it; set attainable subgoals to motivate and guide their efforts; draw from an array of coping strategies rather than rely on a single technique; enlist self-motivating incentives and social supports to sustain the effort needed to succeed; and apply multiple forms of self-influence consistently and persistently (Bandura 1997, Perri 1985). Once equipped with skills and belief in their self-regulatory capabilities, people are better able to adopt behaviors that promote health and to eliminate those that impair it. A large body of evidence
reveals that the self-efficacy belief system operates as a common mechanism through which psychosocial treatments affect different types of health outcomes (Bandura 1997, Holden 1991).
3.3 Maintenance of Change

It is one thing to get people to adopt beneficial health habits. It is another thing to get them to adhere to them. Maintenance of habit change relies heavily on self-regulatory capabilities and the functional value of the behavior. Development of self-regulatory capabilities requires instilling a resilient sense of efficacy as well as imparting skills. Experiences in exercising control over troublesome situations serve as efficacy builders. Efficacy affirmation trials are an important aspect of self-management because, if people are not fully convinced of their personal efficacy, they rapidly abandon the skills they have been taught when they fail to get quick results or suffer reverses. Like any other activity, self-management of health habits includes improvement, setbacks, plateaus, and recoveries. Studies of refractory detrimental habits show that a low sense of efficacy increases vulnerability to relapse (Bandura 1997, Marlatt et al. 1995). To strengthen resilience, people need to develop coping strategies not only to manage common precipitants of breakdown, but also to reinstate control after setbacks. This involves training in how to manage failure (see Health: Self-regulation).
4. Self-management Health Systems

Healthcare expenditures are soaring at a rapid rate. With people living longer and the need for healthcare services rising with age, societies are confronted with major challenges on how to keep people healthy throughout their lifespan; otherwise they will be swamped with burgeoning health costs. Health systems generally focus heavily on the supply side, with the aim of reducing, rationing, and curtailing access to health services to contain health costs. The social cognitive approach works on the demand side, by helping people to stay healthy through good self-management of health habits. This requires intensifying health promotion efforts and restructuring health delivery systems to make them more productive. Efficacy-based models have been devised combining knowledge of self-regulation of health habits with computer-assisted implementation that provides effective health-promoting services in ways that are individualized, intensive, and highly convenient (DeBusk et al. 1994). In this type of self-management system, people monitor their health habits. They set short-term goals for themselves and receive periodic feedback of progress towards their goals, along with guides on how to manage troublesome situations. A minimal sketch of such a monitor-goal-feedback loop appears below.
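The following sketch illustrates the kind of monitor-goal-feedback cycle just described. It is a hypothetical toy implementation, not the DeBusk et al. (1994) system; the habit name, goal values, and feedback messages are invented for the example.

```python
# Hypothetical toy version of a self-management feedback loop: the participant
# logs a health habit, and the system compares the log against a short-term
# subgoal and returns progress feedback. Not the DeBusk et al. (1994) system.

from dataclasses import dataclass

@dataclass
class HabitGoal:
    habit: str           # e.g., "minutes of exercise per week"
    target: float        # short-term subgoal the participant set
    logged: float = 0.0  # self-monitored total for the current period

    def log(self, amount: float) -> None:
        """Record a self-monitored entry."""
        self.logged += amount

    def feedback(self) -> str:
        """Periodic progress feedback toward the subgoal."""
        progress = self.logged / self.target
        if progress >= 1.0:
            return f"{self.habit}: goal met - consider a slightly higher subgoal."
        if progress >= 0.5:
            return f"{self.habit}: {progress:.0%} of goal - on track, keep going."
        return f"{self.habit}: {progress:.0%} of goal - review the guide on managing troublesome situations."

goal = HabitGoal(habit="minutes of exercise per week", target=150)
goal.log(40)
goal.log(35)
print(goal.feedback())  # minutes of exercise per week: 50% of goal - on track, keep going.
```

The design point of such systems is that the feedback rules are fixed and cheap to automate, which is what lets a single implementer serve large numbers of participants simultaneously.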
Efficacy ratings identify areas in which self-regulatory skills must be developed and strengthened if beneficial changes are to be achieved and maintained. The productivity of the system is vastly expanded by combining self-regulatory principles with the power of computer-assisted implementation. A single implementer, assisted by a computerized coordinating and mailing system, provides intensive individualized training in self-management for large numbers of people simultaneously. The self-management system reduces health risk factors, improves health status, and enhances the quality of life in cost-effective ways (Bandura 1997).

The self-management system is well received by participants because it is individually tailored to their needs; provides continuing personalized guidance and informative feedback that enables them to exercise considerable control over their own change; is a home-based program that does not require any special facilities, equipment, or attendance at group meetings that usually have high dropout rates; serves large numbers of people simultaneously under the guidance of a single implementer; is not constrained by time and place; and provides valuable health-promotion services at low cost. By combining the high individualization of the clinical approach with the large-scale applicability of the public health approach, the self-management system includes the features that ensure high social utility. Linking the interactive aspects of the self-management model to the Internet can vastly expand its availability for preventive and promotive guidance.

Chronic disease has become the dominant form of illness and the major cause of disability. The treatment of chronic disease must focus on self-management of physical conditions over time rather than on cure. This requires, among other things, pain amelioration, enhancement and maintenance of functioning with growing physical disability, and development of self-regulative compensatory skills. Holman and Lorig (1992) have devised a prototypical model for the self-management of different types of chronic diseases. Patients are taught cognitive and behavioral pain control techniques; proximal goal setting combined with self-incentives as motivators to increase levels of activity; problem-solving and self-diagnostic skills; and the ability to locate community resources and to manage medication programs. How healthcare systems deal with clients can alter their efficacy in ways that support or undermine their restorative efforts. Clients are, therefore, taught how to take greater initiative for their healthcare in dealings with health personnel. These skills are developed through modeling of self-management skills, guided mastery practice, and enabling feedback. The self-management program retards the biological progression of the disease, raises perceived self-regulatory efficacy, reduces pain and distress, fosters better cognitive symptom management, lessens the
impairment of role functions, improves the quality of life, and decreases the use of medical services. Both the perceived self-efficacy at the outset and the efficacy beliefs instilled by the self-management program predict later health status and functioning (Holman and Lorig 1992).
5. Childhood Health Promotion Models

Many of the lifelong habits that jeopardize health are formed during childhood and adolescence (see Childhood Health). For example, unless youngsters take up the smoking habit as teenagers, they rarely become smokers in adulthood. It is easier to prevent detrimental health habits than to try to change them after they have become deeply entrenched as part of a lifestyle. The biopsychosocial model provides a valuable public health tool for societal efforts to promote the health of youth. Health habits are rooted in familial practices. But schools have a vital role to play in promoting the health of a nation. School is the only place where all children can be easily reached, so it provides a natural setting for promoting healthful habits and building self-management skills.

Effective health promotion models include several major components. The first component is informational: it informs people of the health risks and benefits of different lifestyle habits. The second component develops the social and self-regulative skills for translating informed concerns into effective preventive action. As noted earlier, this includes self-monitoring of health practices, goal setting, and enlistment of self-incentives for personal change. The third component builds a resilient sense of self-regulatory efficacy to support the exercise of control in the face of difficulties that inevitably arise. Personal change occurs within a network of social influences. Depending on their nature, social influences can aid, retard, or undermine efforts at personal change. The final component, therefore, enlists and creates social supports for desired changes in health habits (see Social Support and Health).

Educational efforts to promote the health of youth usually produce weak results. They are heavy on didactics but meager on personal enablement. They provide factual information about health, but do little to equip children with the skills and self-beliefs that enable them to manage the emotional and social pressures to adopt detrimental health habits. Managing health habits involves managing emotional states and diverse social pressures for unhealthy behavior, not just targeting a specific health behavior for change. Health promotion programs that encompass the essential elements of the self-regulatory model prevent or reduce injurious health habits. Health knowledge can be conveyed readily, but changes in values,
attitudes, and health habits require greater effort. The more behavioral mastery experiences provided in the form of role enactments, the greater the beneficial changes (Bruvold 1993). The more intensive the program and the better the implementation, the stronger the impact (Connell et al. 1985). Comprehensive approaches that integrate guided mastery health programs with family and community efforts are more successful in promoting health and preventing adoption of detrimental health habits than are programs in which the schools try to do it alone (Perry et al. 1992).
6. Efficacy Beliefs in Prognostic Judgment and Health Outcomes

Much of the work in the health field is concerned with diagnosing maladies, forecasting the likely course of different physical disorders, and prescribing appropriate remedies. Medical prognostic judgments involve probabilistic inferences from knowledge of varying quality and inclusiveness about the multiple factors governing the course of a given disorder. One important issue regarding medical prognosis concerns the scope of determinants included in a prognostic model. Because psychosocial factors account for some of the variability in the course of health functioning, inclusion of self-efficacy determinants in prognostic models enhances their predictive power (Bandura 2000).

Recovery from medical conditions is partly governed by social factors. Recovery from a heart attack provides one example. About half the patients who experience heart attacks have uncomplicated ones. Their heart heals rapidly, and they are physically capable of resuming an active life. But the psychological and physical recovery is slow for those patients who believe they have an impaired heart. The recovery task is to convince patients that they have a sufficiently robust heart to resume productive lives. Spouses' judgments of patients' physical and cardiac capabilities can aid or retard the recovery process (see Social Support and Recovery from Disease and Medical Procedures). Programs that raise and strengthen spouses' and patients' beliefs in the patients' cardiac capabilities enhance recovery of cardiovascular capacity (Taylor et al. 1985). The couple's joint belief in the patient's cardiac efficacy is the best predictor of improvement in cardiac functioning. Those who believe that their partners have a robust heart are more likely to encourage them to resume an active life than those who believe their partner's heart is impaired and vulnerable to further damage. Pursuit of an active life strengthens the cardiovascular system.

Prognostic judgments are not simply inert forecasts of the natural course of a disease. Prognostic expectations can affect patients' beliefs in their physical efficacy. Therefore, diagnosticians not only foretell,
but may partly influence the course of recovery from disease. Prognostic expectations are conveyed to patients by attitude, word, and the type and level of care provided to them. People are more likely to be treated in enabling ways under positive than under negative expectations. Differential care that promotes in patients different levels of personal efficacy and skill in managing health-related behavior can exert a stronger impact on the trajectories of health functioning than simply conveying prognostic information. Prognostic judgments thus have a self-confirming potential: expectations can alter patients' sense of efficacy and behavior in ways that confirm the original expectations. The self-efficacy mechanism operates as one important mediator of such self-confirming effects.
7. Socially Oriented Approaches to Health

The quality of health of a nation is a social matter, not just a personal one. It requires changing the practices of social systems that impair health rather than just changing the habits of individuals. Vast sums of money are spent annually on advertising and marketing products and promoting lifestyles detrimental to health. With regard to injurious environmental conditions, some industrial and agricultural practices inject carcinogens and harmful pollutants into the air we breathe, the food we eat, and the water we drink, all of which take a heavy toll on health. Vigorous economic and political battles are fought over environmental health and where to set the limits of acceptable risk.

We do not lack sound policy prescriptions in the field of health. What is lacking is the collective efficacy to realize them. People's beliefs in their collective efficacy to accomplish social change by perseverant group action play a key role in the policy and public health approach to health promotion and disease prevention (Bandura 1997, Wallack et al. 1993). Such social efforts take a variety of forms. They raise public awareness of health hazards, educate and influence policymakers, devise effective strategies for improving health conditions, and mobilize public support to enact policy initiatives. While concerted efforts are made to change sociostructural practices, people need to improve their current life circumstances, over which they command some control. Psychosocial models that work best in improving health and preventing disease promote community self-help through collective enablement (McAlister et al. 1991). Given that health is heavily influenced by behavioral, environmental, and economic factors, health promotion requires greater emphasis on the development and enlistment of collective efficacy for socially oriented initiatives.

See also: Control Beliefs: Health Perspectives; Health Behavior: Psychosocial Theories; Health Behaviors;
Health Education and Health Promotion; Health: Self-regulation; Self-efficacy; Self-efficacy: Educational Aspects
Bibliography

Bandura A 1986 Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1997 Self-efficacy: The Exercise of Control. W H Freeman, New York
Bandura A 2000 Psychological aspects of prognostic judgments. In: Evans R W, Baskin D S, Yatsu F M (eds.) Prognosis of Neurological Disorders, 2nd edn. Oxford University Press, New York, pp. 11–27
Bruvold W H 1993 A meta-analysis of adolescent smoking prevention programs. American Journal of Public Health 83: 872–80
Connell D B, Turner R R, Mason E F 1985 Summary of findings of the school health education evaluation: Health promotion effectiveness, implementation, and costs. Journal of School Health 55: 316–21
DeBusk R F et al. 1994 A case-management system for coronary risk factor modification after acute myocardial infarction. Annals of Internal Medicine 120: 721–9
Dienstbier R A 1989 Arousal and physiological toughness: Implications for mental and physical health. Psychological Review 96: 84–100
Herbert T B, Cohen S 1993a Depression and immunity: A meta-analytic review. Psychological Bulletin 113: 472–86
Herbert T B, Cohen S 1993b Stress and immunity in humans: A meta-analytic review. Psychosomatic Medicine 55: 364–79
Holahan C K, Holahan C J 1987 Self-efficacy, social support, and depression in aging: A longitudinal analysis. Journal of Gerontology 42: 65–8
Holden G 1991 The relationship of self-efficacy appraisals to subsequent health related outcomes: A meta-analysis. Social Work in Health Care 16: 53–93
Holman H, Lorig K 1992 Perceived self-efficacy in self-management of chronic disease. In: Schwarzer R (ed.) Self-efficacy: Thought Control of Action. Hemisphere Pub. Corp, Washington, DC, pp. 305–23
Maibach E, Flora J, Nass C 1991 Changes in self-efficacy and health behavior in response to a minimal contact community health campaign. Health Communication 3: 1–15
Maier S F, Laudenslager M L, Ryan S M 1985 Stressor controllability, immune function, and endogenous opiates. In: Brush F R, Overmier J B (eds.) Affect, Conditioning, and Cognition: Essays on the Determinants of Behavior. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 183–201
Marlatt G A, Baer J S, Quigley L A 1995 Self-efficacy and addictive behavior. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York, pp. 289–315
McAlister A L, Puska P, Orlandi M, Bye L L, Zbylot P 1991 Behaviour modification: Principles and illustrations. In: Holland W W, Detels R, Knox G (eds.) Oxford Textbook of Public Health: Applications in Public Health, 2nd edn. Oxford University Press, Oxford, UK, pp. 3–16
Meyerowitz B E, Chaiken S 1987 The effect of message framing on breast self-examination attitudes, intentions, and behavior. Journal of Personality and Social Psychology 52: 500–10
O'Leary A 1990 Stress, emotion, and human immune function. Psychological Bulletin 108: 363–82
Perri M G 1985 Self-change strategies for the control of smoking, obesity, and problem drinking. In: Shiffman S, Wills T A (eds.) Coping and Substance Use. Academic Press, Orlando, FL, pp. 295–317
Perry C L, Kelder S H, Murray D M, Klepp K 1992 Community-wide smoking prevention: Long-term outcomes of the Minnesota heart health program and the class of 1989 study. American Journal of Public Health 82: 1210–6
Schneiderman N, McCabe P M, Baum A (eds.) 1992 Stress and Disease Processes: Perspectives in Behavioral Medicine. Erlbaum, Hillsdale, NJ
Schwarzer R (ed.) 1992 Self-efficacy: Thought Control of Action. Hemisphere Pub. Corp, Washington, DC
Taylor C B, Bandura A, Ewart C K, Miller N H, DeBusk R F 1985 Exercise testing to enhance wives' confidence in their husbands' cardiac capabilities soon after clinically uncomplicated acute myocardial infarction. American Journal of Cardiology 55: 635–8
Wallack L, Dorfman L, Jernigan D, Themba M 1993 Media Advocacy and Public Health: Power for Prevention. Sage, Newbury Park, CA
Wiedenfeld S A, Bandura A, Levine S, O'Leary A, Brown S, Raska K 1990 Impact of perceived self-efficacy in coping with stressors on components of the immune system. Journal of Personality and Social Psychology 59: 1082–94
A. Bandura
Self-efficacy: Educational Aspects

Current theoretical accounts of learning and instruction postulate that students are active seekers and processors of information (Pintrich et al. 1986). Research indicates that students' cognitions influence the instigation, direction, strength, and persistence of achievement behaviors (Schunk 1995). This article reviews the role of one type of personal cognition: self-efficacy, or one's perceived capabilities for learning or performing behaviors at designated levels (Bandura 1997). The role of self-efficacy in educational contexts is discussed, including the cues that students use to appraise their self-efficacy. A model of the operation of self-efficacy is explained, along with some key findings from educational research. The entry concludes by describing the role of teacher efficacy and suggesting future research directions.
1. Self-efficacy Theory

Self-efficacy can affect choice of activities, effort, persistence, and achievement (Bandura 1997, Schunk 1991). Compared with students who doubt their learning capabilities, those with high self-efficacy for accomplishing a task participate more readily, work
harder, persist longer when they encounter difficulties, and demonstrate higher achievement.

Learners acquire information to appraise self-efficacy from their performance accomplishments, vicarious (observational) experiences, forms of persuasion, and physiological reactions. Students' own performances offer reliable guides for assessing efficacy. Successes raise self-efficacy and failures lower it, but once a strong sense of self-efficacy is developed a failure may not have much impact (Bandura 1986). Learners also acquire self-efficacy information from knowledge of others through classroom social comparisons. Similar others offer the best basis for comparison. Students who observe similar peers perform a task are apt to believe that they, too, are capable of accomplishing it. Information acquired vicariously typically has a weaker effect on self-efficacy than performance-based information because the former can be negated easily by subsequent failures. Students often receive persuasive information from teachers and parents that they are capable of performing a task (e.g., 'You can do this'). Positive feedback enhances self-efficacy, but this increase will be temporary if subsequent efforts turn out poorly. Students also acquire efficacy information from physiological reactions (e.g., heart rate, sweating). Symptoms signaling anxiety might be interpreted to mean that one lacks skills. Information acquired from these sources does not automatically influence self-efficacy; rather, it is cognitively appraised (Bandura 1986). In appraising efficacy, learners weigh and combine perceptions of their ability, the difficulty of the task, the amount of effort expended, the amount of external assistance received, the number and pattern of successes and failures, similarity to models, and credibility of persuaders (Schunk 1991).

Self-efficacy is not the only influence in educational settings. Achievement behavior also depends on knowledge and skills, outcome expectations, and the perceived value of outcomes (Schunk 1991). High self-efficacy does not produce competent performances when requisite knowledge and skills are lacking. Outcome expectations, or beliefs concerning the probable outcomes of actions, are important because students strive for positive outcomes. Perceived value of outcomes refers to how much learners desire certain outcomes relative to others. Learners are motivated to act in ways that they believe will result in outcomes they value.

Self-efficacy is dynamic and changes as learning occurs. The hypothesized process whereby self-efficacy operates during learning is as follows (Schunk 1996). Students enter learning situations with varying degrees of self-efficacy for learning. They also have goals in mind, such as learning the material, working quickly, pleasing the teacher, and making a high grade. As they engage in the task, they receive cues about how well they are performing, and they use these cues to assess
their learning progress and their self-efficacy for continued learning. Perceived progress sustains motivation and leads to continued learning. Perceptions of little progress do not necessarily diminish self-efficacy if learners believe they know how to perform better, such as by working harder, seeking help, or switching to a more effective strategy (Schunk 1996).
2. Factors Affecting Self-efficacy

There are many instructional, social, and environmental factors that operate during learning. Several of these factors have been investigated to determine how they influence learners' self-efficacy. For example, research has explored the roles of goal setting, social modeling, rewards, attributional feedback, social comparisons, progress monitoring, opportunities for self-evaluation of progress, progress feedback, and strategy instruction (Schunk 1995).

As originally conceptualized by Bandura, self-efficacy is a domain-specific construct. Self-efficacy research in education has tended to follow this guidance and assess students' self-efficacy within domains at the level of individual tasks. In mathematics, for example, students may be shown sample multiplication problems and for each sample judge their confidence for solving similar problems correctly. Efficacy scales typically are numerical and range from low to high confidence. After completing the efficacy assessment students are presented with actual problems to solve. These achievement test problems correspond closely to those on the self-efficacy test, although they are not identical. Such specificity allows researchers to relate self-efficacy to achievement to determine correspondence and prediction (Pajares 1996). Other measures often collected by self-efficacy researchers include persistence, motivation, and self-regulation strategies. Following a pretest, students receive instruction complemented by one or more of the preceding educational variables. After the instruction is completed students receive a post-test; in some studies follow-up maintenance testing is done.

A general finding from much educational self-efficacy research is that educational variables influence self-efficacy to the extent that they convey to learners information about their progress in learning (Schunk 1995). For example, much research shows that specific proximal goals raise self-efficacy, motivation, and achievement better than do general goals (Schunk 1995). Short-term specific goals provide a clear standard against which to compare learning progress. As learners determine that they are making progress, this enhances their self-efficacy for continued learning. In contrast, assessing progress against a general goal (e.g., 'Do your best') is difficult; thus, learners receive less clear information about progress, and self-efficacy is not strengthened as well.
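As an illustration of the task-specific assessment strategy described earlier in this section, the following sketch uses invented confidence ratings (all numbers are hypothetical) to compute one efficacy score per student and relate it to performance on matched achievement items.

```python
# Hypothetical illustration of a task-specific efficacy assessment:
# confidence ratings (10-100) for sample problems are averaged into an
# efficacy score and related to performance on matched achievement items.
import numpy as np

# Invented data: 5 students x 4 sample problems.
confidence = np.array([
    [90, 80, 85, 95],
    [40, 35, 50, 45],
    [70, 75, 60, 65],
    [20, 30, 25, 35],
    [85, 90, 95, 80],
])
efficacy = confidence.mean(axis=1)       # one self-efficacy score per student

# Invented number of matched achievement problems solved (out of 4).
solved = np.array([4, 2, 3, 1, 4])

r = np.corrcoef(efficacy, solved)[0, 1]  # efficacy-achievement correlation
print("self-efficacy scores:", efficacy)
print(f"correlation with achievement: r = {r:.2f}")
```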
3. Predictive Utility of Self-efficacy

Self-efficacy research has examined the relation of self-efficacy to such educational outcomes as motivation, persistence, and achievement (Pajares 1996). Significant and positive correlations have been obtained across many studies between self-efficacy assessed prior to instruction and subsequent motivation during instruction. Initial judgments of self-efficacy have been found also to correlate positively and significantly with post-test measures of self-efficacy and achievement collected following instruction.

Multiple regression has been used to determine the percentage of variability in skillful performance accounted for by self-efficacy. Schunk and Swartz (1993) found that post-test self-efficacy was the strongest predictor of children's paragraph writing skills. Shell et al. (1989) found that although self-efficacy and outcome expectations predicted reading and writing achievement, self-efficacy was the strongest predictor.

Several studies have tested causal models. Schunk (1981) employed path analysis to reproduce the correlation matrix comprising long-division instructional method, self-efficacy, persistence, and achievement. The best model showed a direct effect of method on achievement and an indirect effect through persistence and self-efficacy, an indirect effect of method on persistence through self-efficacy, and a direct effect of self-efficacy on persistence and achievement. Schunk and Gunn (1986) found that the largest direct influence on achievement was due to use of effective learning strategies; achievement also was heavily influenced by self-efficacy.
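The regression and path-analytic logic described in this section can be sketched as follows. The data are simulated and the coefficients are illustrative only, not those reported by Schunk (1981), but the decomposition into a direct effect of self-efficacy on achievement and an indirect effect through persistence follows the same pattern.

```python
# Schematic path decomposition on invented standardized data:
# self-efficacy -> persistence -> achievement, estimated with
# ordinary least-squares regressions.
import numpy as np

rng = np.random.default_rng(1)
n = 300

efficacy = rng.normal(size=n)
persistence = 0.5 * efficacy + rng.normal(scale=0.8, size=n)
achievement = 0.4 * efficacy + 0.3 * persistence + rng.normal(scale=0.8, size=n)

def std(x):
    return (x - x.mean()) / x.std()

e, p, a = std(efficacy), std(persistence), std(achievement)

# Path a: efficacy -> persistence (simple standardized regression slope).
path_a = np.polyfit(e, p, 1)[0]

# Paths b and c': achievement regressed on persistence and efficacy.
X = np.column_stack([np.ones(n), p, e])
_, path_b, path_c = np.linalg.lstsq(X, a, rcond=None)[0]

print(f"direct effect of efficacy on achievement (c'): {path_c:.2f}")
print(f"indirect effect via persistence (a*b):        {path_a * path_b:.2f}")
```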
4. Teacher Self-efficacy

Self-efficacy is applicable to teachers as well as students. Ashton and Webb (1986) postulated that self-efficacy should influence teachers' activities, efforts, and persistence. Teachers with low self-efficacy may avoid planning activities that they believe exceed their capabilities, may not persist with students having difficulties, may expend little effort to find materials, and may not reteach content in ways students might better understand. Teachers with higher self-efficacy might develop challenging activities, help students succeed, and persevere with students who have trouble learning. These motivational effects enhance student learning and substantiate teachers' self-efficacy by suggesting that they can help students learn.

Correlational data show that self-efficacy is related to teaching behavior. Ashton and Webb (1986) found that teachers with higher self-efficacy were likely to have a positive classroom environment (e.g., less student anxiety and teacher criticism), support students' ideas, and meet the needs of all students. High teacher self-efficacy was positively associated with use of praise, individual attention to students, checking on
students' progress in learning, and their mathematical and language achievement. Tschannen-Moran et al. (1998) discuss teacher self-efficacy in greater depth.
5. Future Research Directions

The proliferation of self-efficacy research in education has enlightened understanding of the construct but also has resulted in a multitude of measures. In developing efficacy assessments it is imperative that researchers attempt to be faithful to Bandura's (1986) conceptualization of self-efficacy as a domain-specific measure. Research will benefit from researchers publishing their instruments along with validation data, including reliability and validity.

As self-efficacy research continues in settings where learning occurs, it will be necessary to collect longitudinal data showing how self-efficacy changes over time as a consequence of learning. This focus will require broadening of self-efficacy assessments from reliance on numerical scales to qualitative data. Researchers also should relate measures of teacher self-efficacy to those of student self-efficacy to test the idea that these variables reciprocally influence one another.

Finally, research is needed on the role of self-efficacy during self-regulation. Self-regulation refers to self-generated thoughts and actions that are systematically oriented toward attainment of one's learning goals (Zimmerman 1990). Self-efficacy has the potential to influence many aspects of self-regulation, yet to date only a few areas have been explored in research. This focus will become more critical as self-regulation assumes an increasingly important role in education.
Bibliography

Ashton P T, Webb R B 1986 Making a Difference: Teachers' Sense of Efficacy and Student Achievement. Longman, New York
Bandura A 1986 Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1997 Self-efficacy: The Exercise of Control. Freeman, New York
Pajares F 1996 Self-efficacy beliefs in academic settings. Review of Educational Research 66: 543–78
Pintrich P R, Cross D R, Kozma R B, McKeachie W J 1986 Instructional psychology. Annual Review of Psychology 37: 611–51
Schunk D H 1981 Modeling and attributional effects on children's achievement: A self-efficacy analysis. Journal of Educational Psychology 73: 93–105
Schunk D H 1991 Self-efficacy and academic motivation. Educational Psychologist 26: 207–31
Schunk D H 1995 Self-efficacy and education and instruction. In: Maddux J E (ed.) Self-efficacy, Adaptation, and Adjustment: Theory, Research, and Application. Plenum, New York, pp. 281–303
Schunk D H 1996 Goal and self-evaluative influences during children's cognitive skill learning. American Educational Research Journal 33: 359–82
Schunk D H, Gunn T P 1986 Self-efficacy and skill development: Influence of task strategies and attributions. Journal of Educational Research 79: 238–44
Schunk D H, Swartz C W 1993 Goals and progress feedback: Effects on self-efficacy and writing achievement. Contemporary Educational Psychology 18: 337–54
Shell D F, Murphy C C, Bruning R H 1989 Self-efficacy and outcome expectancy mechanisms in reading and writing achievement. Journal of Educational Psychology 81: 91–100
Tschannen-Moran M, Hoy A W, Hoy W K 1998 Teacher efficacy: Its meaning and measure. Review of Educational Research 68: 202–48
Zimmerman B J 1990 Self-regulating academic learning and achievement: The emergence of a social cognitive perspective. Educational Psychology Review 2: 173–201
D. H. Schunk
Self-esteem in Adulthood

Self-esteem refers to a global judgment of the worth or value of the self as a whole (similar to self-regard, self-respect, and self-acceptance), or to evaluations of specific aspects of the self (e.g., appearance self-esteem or academic self-esteem). The focus of this article is on global self-esteem, which has distinct theoretical importance and consequences (Baumeister 1998, Rosenberg et al. 1995). Although thousands of studies of self-esteem have been published (Mruk 1995), many central questions about the nature, functioning, and importance of global self-esteem remain unresolved (Baumeister 1998).
1. The Importance of Self-esteem

Global self-esteem is a central aspect of the subjective quality of life. It is related strongly to positive affect and life satisfaction (Diener 1984), less anxiety (Solomon et al. 1991), and fewer depressive symptoms (Crandall 1973). High and low levels of self-esteem appear to be associated with different motivational orientations. High self-esteem people focus on self-enhancement and 'being all that they can be,' whereas low self-esteem people focus on self-protection and avoiding failure and humiliation (Baumeister et al. 1989). Low self-esteem has been proposed as a cause of many social problems, such as teenage pregnancy, aggression, eating disorders, and poor school achievement. However, evidence that low self-esteem is a cause, rather than a symptom, of these problems is scarce (Baumeister 1998, Dawes 1994, Mecca et al. 1989), and some researchers have suggested that high
self-esteem may actually be the cause of social problems such as aggression (Baumeister 1998). Because of unresolved issues regarding the nature and functioning of self-esteem and how to measure it, firm conclusions about whether high self-esteem is or is not socially useful seem premature.
2. Issues in the Self-esteem Literature

2.1 Trait or State?

Psychologists typically assume that self-esteem is a psychological trait (i.e., that it is stable over time and across situations). In support of this view, self-esteem tends to be highly stable over long periods of time (see, e.g., Rosenberg 1979). Self-esteem is also a state, however, changing in response to events and experiences in the course of life (Heatherton and Polivy 1991). James (1890) suggested that self-esteem has qualities of both a state and a trait, rising and falling in response to achievements and setbacks relevant to one's aspirations. On the other hand, he recognized that people tend to have average levels of self-esteem that are not linked directly to their objective circumstances. Research supports James's intuition: for some people self-esteem is relatively stable and trait-like across time, whereas for others it is more state-like, fluctuating daily (Kernis and Waschull 1995).
2.2 Affect or Cognitive Judgment?

Researchers disagree about whether self-esteem is fundamentally a feeling or a judgment about the self. Global self-esteem and mood are correlated strongly, leading some to conclude that affect is a component of self-esteem (e.g., Brown 1993, Pelham and Swann 1989). Others argue that self-esteem is a cognitive judgment, based on standards of worth and accessible information about how well an individual is meeting those standards (see, e.g., Moretti and Higgins 1990). Current mood may be one source of information on which judgments of self-worth are based (Schwarz and Strack 1999). It seems likely that the relationship between self-esteem and mood is complex: both mood and self-esteem may be affected independently by life events; mood may be a source of information on which judgments of self-esteem are based; and mood may also be a consequence of having high or low self-esteem.
2.3 Where Does Self-esteem Come From?

Why are some people high and others low in self-esteem? James (1890) suggested that global self-esteem
is determined by successes divided by pretensions, or how well a person is doing in areas that are important. Although it might seem logical that high self-esteem results from success in life (e.g., being smart, attractive, wealthy, and popular), these objective outcomes are related only weakly to self-esteem. For example, socioeconomic status (Twenge and Campbell 1999), physical attractiveness as rated by observers (Diener et al. 1995, Feingold 1992), obesity (Miller and Downey 1999), school achievement (Rosenberg et al. 1995), and popularity (Wylie 1979) are related only weakly to global self-esteem. A stronger relationship is observed between global self-esteem and how well people believe they are doing in important domains, but the direction of this relationship is unclear. Believing one is doing well in important domains might cause high self-esteem, or people may think they are doing well because they have high self-esteem. Experimental studies have demonstrated that specific self-evaluations are sensitive to manipulated success or failure, but evidence that global self-esteem responds to such feedback is very scarce (Blascovich and Tomaka 1991).

Mead (1934) and Cooley (1902) proposed that self-esteem develops in social relationships. Cooley (1902) argued that subjectively-interpreted feedback from others is a main source of information about the self. The self-concept arises from imagining how others perceive and evaluate the self. These 'reflected appraisals' affect self-perceptions and self-evaluations, resulting in what Cooley (1902) described as the 'looking glass self.' Mead (1934) argued that the looking glass self is a product of, and essential to, social interaction. To interact smoothly and effectively with others, people need to anticipate how others will react to them, and so they need to learn to see themselves through the eyes of others, either the specific people with whom they interact, or a generalized view of how most people see the self, or a 'generalized other.' Research indicates that self-esteem is related only weakly to others' evaluations of the self, but is related strongly to beliefs about others' evaluations (Shrauger and Schoeneman 1979). Evidence regarding the causal direction of this effect is scarce.

A third view, which can encompass these others, is that self-esteem is a judgment of self-worth constructed on the basis of information and standards for the self that are available and accessible at the moment. People may differ in the self-standards that are chronically accessible to them (see, e.g., Higgins 1987). For example, some people may judge their self-worth chronically according to whether they are competent in important domains, whereas others judge their self-worth chronically according to whether others approve of them (Crocker and Wolfe 2001). In this view, self-esteem will be stable over time if the standards used to evaluate the self and information about how well an individual is doing relative to those standards are stable. When circumstances make alternative standards salient, or alter beliefs about how well the individual is doing relative to those standards, then self-esteem changes (Crocker 1999, Quinn and Crocker 1999).
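James's (1890) formulation at the start of this section is traditionally summarized as a ratio; in modern notation (the symbols are a paraphrase, not a quantitative model):

```latex
\text{self-esteem} = \frac{\text{successes}}{\text{pretensions}}
```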
2.4 Defensive or Genuine?

Genuine self-esteem is usually assumed to be synonymous with self-worth, self-respect, and self-acceptance, despite awareness of one's flaws and shortcomings. Yet, many studies have demonstrated that people who are high in self-esteem are more likely to make excuses for failure, derogate others when threatened, and have unrealistically positive views of themselves (see Baumeister 1998, Blaine and Crocker 1993, Taylor and Brown 1988 for reviews), behaviors which appear to be quite defensive. These findings have fueled the suspicion that many people who are outwardly high in self-esteem inwardly harbor serious doubts about their self-worth, and have defensive, rather than genuinely high, self-esteem. Although the distinction between genuine and defensively high self-esteem has a long history in psychology, researchers have had little success at distinguishing these two types of high self-esteem empirically.

One view is that implicit, or nonconscious, evaluations of the self are dissociated from conscious self-evaluations (see, e.g., Greenwald and Banaji 1995). According to this view, genuine high self-esteem results from having high explicit (conscious) and implicit (nonconscious) self-esteem, whereas defensively high self-esteem results from having high explicit self-esteem and low implicit self-esteem. A measure of implicit self-esteem based on the Implicit Association Test (Greenwald et al. 1998) shows the predicted dissociation between implicit and explicit measures, but to date research has not demonstrated that defensive behaviors such as blaming others for failure are associated uniquely with the combination of high explicit and low implicit self-esteem.

Another view is that people who have stable high self-esteem are relatively nondefensive in the face of failure, whereas people with unstable high self-esteem are both defensive and hostile when they fail (see Kernis and Waschull 1995 for a review). According to Kernis and his colleagues, people with unstable high self-esteem have a high level of ego-involvement in everyday events; consequently, their self-esteem is at stake even when relatively minor negative events occur. Considerable evidence has accumulated supporting the view that people with unstable high self-esteem are defensive, whereas people with stable high self-esteem are not. Crocker and Wolfe (2001) argue that instability of self-esteem results when outcomes in a person's life are relevant to their contingencies, or conditions, of self-worth. Consequently, Crocker and Wolfe argue that people are defensive when they receive negative or threatening information in domains in which their self-esteem is contingent. To date, however, they have not provided empirical support for their view.

In general, the issue of defensive vs. genuine self-esteem has focused attention on dimensions of self-esteem that go beyond whether it is high or low. This broader perspective on multiple dimensions of self-esteem might help resolve several issues in the self-esteem literature (Crocker and Wolfe 2001). For example, the role of self-esteem in social problems such as substance abuse and eating disorders may be linked to instability or contingencies of self-esteem as well as, or instead of, level of self-esteem.
2.5 A Cultural Universal?

Psychologists have long assumed that there is a universal need to have high self-esteem. Consistent with this view, most people have high self-esteem, and will go to great lengths to achieve, maintain, and protect it. Yet, almost all of this research has been conducted in a North American cultural context, leaving open the possibility that the need for high self-esteem is a culturally specific phenomenon (see Heine et al. 1999 for a review). Levels of self-esteem in Asians are related to how much time they have spent in the USA or Canada (Heine et al. 1999). Furthermore, Asians and Asian-Americans, on average, do not show the same self-enhancing tendencies so characteristic of North Americans. Heine et al. argue that there are fundamental cultural differences in the nature and importance of self-esteem. Taking Japan as an example, they argue that, whereas in the USA and Canada people are self-enhancing and motivated to achieve, maintain, and protect high self-esteem, in Japan people are self-critical and motivated to improve the self. This self-critical orientation in Japan, they argue, is adaptive in a culture that values self-criticism and self-improvement, and considers them to be evidence of commitment to the group. Consequently, in contrast to North America, self-criticism may lead to positive self-feelings resulting from living up to cultural standards that value self-criticism. The notion that the need for self-esteem and the motivation to self-enhance are not universal is a crucial development in self-esteem research.
3. Measurement of Self-esteem

In 1974, Wylie reviewed research on self-esteem and criticized researchers for developing idiosyncratic measures of self-esteem, rather than well-established, psychometrically valid and reliable instruments. Although measurement of self-esteem has improved somewhat since Wylie's critique (see Blascovich and Tomaka 1991 for a review), there remains a tendency
for researchers to develop idiosyncratic measures for particular studies. Issues in the measurement of self-esteem tend to reflect issues in its conceptualization. Measures of trait self-esteem assess global judgments of self-worth, self-respect, or self-regard that encourage respondents to consider how they usually or generally evaluate themselves (see, e.g., Rosenberg 1965), or assess evaluations of the self in several domains and create a composite score, on the assumption that such a composite is an indicator of global self-esteem (see, e.g., Coopersmith 1967).

Measures of state self-esteem, on the other hand, focus on how the person feels at a specific moment in time. Some state measures assess momentary evaluations of the self in one or more domains such as appearance or performance (see, e.g., Heatherton and Polivy 1991). Others assess current mood or self-related affect with self-ratings on items such as feeling proud, important, and valuable (see, e.g., Leary et al. 1995). Both types of state self-esteem measures appear to be responsive to positive and negative events. Researchers interested in state self-esteem tend not to measure momentary or current self-worth, self-regard, or self-respect (i.e., global state self-esteem). Consequently, studies of state self-esteem and studies of trait self-esteem tend to measure different constructs, making results across these types of studies difficult to compare.
4. Future Directions

Several important issues remain to be addressed by research. First, additional progress is needed in measurement and conceptualization of self-esteem, and in identifying and validating different types of self-esteem (e.g., defensive vs. genuine, implicit vs. explicit, stable vs. unstable, and contingent vs. noncontingent). Possible cultural and subcultural differences in the nature and functioning of self-esteem are a very important area needing further exploration. Only after progress has been made in these areas will researchers be able to provide definitive answers to questions about the social importance of self-esteem.

See also: Intrinsic Motivation, Psychology of; Self-efficacy; Self-evaluative Process, Psychology of; Self: History of the Concept; Self-regulation in Adulthood; Well-being (Subjective), Psychology of
Bibliography

Baumeister R F 1998 The self. In: Gilbert D T, Fiske S T, Lindzey G (eds.) Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, pp. 680–740
Baumeister R F, Tice D M, Hutton D G 1989 Self-presentational motivations and personality differences in self-esteem. Journal of Personality 57: 547–79
Blaine B, Crocker J 1993 Self-esteem and self-serving biases in reactions to positive and negative events: An integrative review. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Erlbaum, Hillsdale, NJ, pp. 55–85
Blascovich J, Tomaka J 1991 Measures of self-esteem. In: Robinson J P, Shaver P R, Wrightsman L S (eds.) Measures of Personality and Social Psychological Attitudes. Academic Press, San Diego, CA, pp. 115–60
Brown J D 1993 Motivational conflict and the self: The double-bind of low self-esteem. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Plenum, New York, pp. 117–30
Cooley C H 1902 Human Nature and the Social Order. Schocken, New York
Coopersmith S 1967 The Antecedents of Self-esteem. W. H. Freeman, San Francisco, CA
Crandall R 1973 The measurement of self-esteem and related constructs. In: Robinson J, Shaver P R (eds.) Measures of Social Psychological Attitudes. Institute for Social Research, Ann Arbor, MI
Crocker J 1999 Social stigma and self-esteem: Situational construction of self-worth. Journal of Experimental Social Psychology 35: 89–107
Crocker J, Wolfe C T 2001 Contingencies of self-worth. Psychological Review 108: 593–623
Dawes R M 1994 House of Cards: Psychology and Psychotherapy Built on Myth. Free Press, New York
Diener E 1984 Subjective well-being. Psychological Bulletin 95: 542–75
Diener E, Wolsic B, Fujita F 1995 Physical attractiveness and subjective well-being. Journal of Personality and Social Psychology 69: 120–29
Feingold A 1992 Good-looking people are not what we think. Psychological Bulletin 111: 304–41
Greenwald A G, Banaji M R 1995 Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review 102: 4–27
Greenwald A G, McGhee D E, Schwarz J L K 1998 Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology 74: 1464–80
Heatherton T F, Polivy J 1991 Development and validation of a scale for measuring state self-esteem. Journal of Personality and Social Psychology 60: 895–910
Heine S J, Lehman D R, Markus H R, Kitayama S 1999 Is there a universal need for positive self-regard? Psychological Review 106: 766–94
Higgins E T 1987 Self-discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
James W 1890 The Principles of Psychology. Harvard University Press, Cambridge, MA, Vol. 1
Kernis M H, Waschull S B 1995 The interactive roles of stability and level of self-esteem: Research and theory. In: Zanna M P (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 27, pp. 93–141
Leary M R, Tambor E S, Terdal S K, Downs D L 1995 Self-esteem as an interpersonal monitor: The sociometer hypothesis. Journal of Personality and Social Psychology 68: 518–30
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago, IL
Mecca A M, Smelser N J, Vasconcellos J (eds.) 1989 The Social Importance of Self-esteem. University of California Press, Berkeley, CA
Miller C T, Downey K T 1999 A meta-analysis of heavyweight and self-esteem. Personality and Social Psychology Review 3: 68–84
Moretti M M, Higgins E T 1990 Relating self-discrepancy to self-esteem: The contribution of discrepancy beyond actual self-ratings. Journal of Experimental Social Psychology 26: 108–23
Mruk C 1995 Self-esteem: Research, Theory, and Practice. Springer, New York
Pelham B W, Swann W B Jr 1989 From self-conceptions to self-worth: On the sources and structure of global self-esteem. Journal of Personality and Social Psychology 57: 672–80
Quinn D M, Crocker J 1999 When ideology hurts: Effects of feeling fat and the protestant ethic on the psychological well-being of women. Journal of Personality and Social Psychology 77: 402–14
Rosenberg M 1965 Society and the Adolescent Self-image. Princeton University Press, Princeton, NJ
Rosenberg M 1979 Conceiving the Self. Basic Books, New York
Rosenberg M, Schooler C, Schoenbach C, Rosenberg F 1995 Global self-esteem and specific self-esteem: Different concepts, different outcomes. American Sociological Review 60: 141–56
Schwarz N, Strack F 1999 Reports of subjective well-being: Judgmental processes and their methodological implications. In: Kahneman D, Diener E, Schwarz N (eds.) Well-being: Foundations of Hedonic Psychology. Russell Sage, New York, pp. 61–84
Shrauger J S, Schoeneman T J 1979 Symbolic interactionist view of self-concept: Through the looking-glass darkly. Psychological Bulletin 86: 549–73
Solomon S, Greenberg J, Pyszczynski T 1991 A terror management theory of social behavior: The psychological functions of self-esteem and cultural worldviews. In: Zanna M P (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 24, pp. 91–159
Taylor S E, Brown J D 1988 Illusion and well-being: A social-psychological perspective on mental health. Psychological Bulletin 103: 193–210
Twenge J M, Campbell W K 1999 Does Self-esteem Relate to Being Rich, Successful, and Well-educated? A Meta-analytic Review. Manuscript, Case Western Reserve University
Wylie R C 1974 The Self-concept (rev. edn.). University of Nebraska Press, Lincoln, NE
Wylie R C 1979 The Self-concept: Theory and Research on Selected Topics, 2nd edn. University of Nebraska Press, Lincoln, NE, Vol. 2
J. Crocker
Self-evaluative Process, Psychology of

This article explores some of the processes associated with evaluative response to one's own self. Evaluation refers to a response registering the idea/feeling that some aspect of one's world is good or bad, likeable or dislikable, valuable or worthless. It is one of our most fundamental psychological responses. Indeed, research on the psychology of meaning has revealed that
evaluation is the single most important aspect of meaning. Evaluative responses are fast; there is evidence that we have an evaluation of some things even before we are completely able to recognize them. Evaluative responses are sometimes automatic—we do not set out to make them and we often cannot turn them off. Moreover, evaluation often colors our interpretation of situations. For example, ambiguous actions of people we like are interpreted more benevolently than the same actions of people we do not like. An evaluative response attached to the self is often termed self-esteem, and we will use the terms self-esteem and self-evaluation interchangeably.
1. Individual Differences in Self-evaluation

There are literally thousands of studies reported since the 1950s measuring self-esteem and comparing persons who are high with persons who are low on this dimension (see Self-esteem in Adulthood). Most frequently, self-esteem is assessed by self-report. One of the most popular measures (Rosenberg 1965) consists of ten items like 'I am a person of worth' followed by a series of graded response options, e.g., strongly agree, agree, disagree, strongly disagree. Such measures have proven to be reliable and valid. However, they are subject to the same general criticisms of any self-report measure: Scores can be distorted by the tendency to agree with an item regardless of its content and the tendency to try to create a favorable impression. Moreover, there may be aspects of one's self-evaluation that are not easily accessible to conscious awareness. To address some of these concerns 'implicit' measures of self-esteem are currently being explored (Greenwald and Banaji 1995). Most of these measures work by priming the self, i.e., making the self salient, and then measuring the impact of self-salience on other evaluative responses. For example, when the self is primed, the more positive the self-evaluation the faster one should be in making other positive evaluative judgments. As of this writing, implicit measures of self-evaluation show great promise but it is still unclear what impact such measures will have on our ultimate understanding of self-evaluation.

Individual differences in self-evaluation have been associated with a variety of psychological traits. For example, compared to persons with low self-esteem, persons with high self-esteem tend to achieve more in school, be less depressed, better adjusted, less socially anxious and more satisfied with life, etc. Going through this research almost leads one to draw the conclusion that all the good things in life are positively associated with self-esteem. Indeed, the intuition that high self-esteem is good is so compelling that the state of California even put together a task force to promote self-esteem. There are, of course, many arguments for the positive impact of self-esteem. However, most of
the studies rely on correlational methods. Correlational methodology makes it difficult to know if self-esteem is a cause or an effect of these other variables. For example, it may be that high self-esteem leads to school achievement but it may also be that school achievement improves self-esteem. It may also be that the correlation between achievement and self-esteem is not causal at all; each may be caused by the same third variable, e.g., general health.

Recent research is beginning to correct the simple view of self-esteem as always 'good.' For example, it may be persons who are high in self-esteem rather than persons who are low in self-esteem that are most likely to be aggressive (Baumeister et al. 1996). Why? Persons high in self-esteem have more to lose when confronted by failure or a personal affront. Related to this suggestion is the observation that self-esteem may be stable in some persons but unstable in others. Stability of self-esteem is consequential (Kernis and Waschull 1995). Persons whose self-esteem is high on the average but whose self-evaluation fluctuates over time score higher on a hostility measure than persons who are high in self-esteem but whose self-evaluation is stable. Perhaps it is persons who aspire to feel positive about themselves but are unsure of themselves that tend to fluctuate in their self-evaluation and to respond aggressively to threats to self-esteem.
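To make the self-report format described at the start of this section concrete, here is a schematic scoring function for a ten-item scale of the Rosenberg type. The 4-point coding and the set of reverse-keyed items are invented for illustration and do not reproduce the published instrument.

```python
# Schematic scoring of a ten-item self-report scale of the Rosenberg
# type. The 4-point coding and the reverse-keyed item positions are
# invented for illustration; they do not reproduce the published scale.
def score_self_esteem(responses, reverse_keyed=(2, 5, 6, 8, 9)):
    """responses: ten ratings, 1 (strongly disagree) .. 4 (strongly agree)."""
    total = 0
    for i, rating in enumerate(responses):
        if not 1 <= rating <= 4:
            raise ValueError(f"item {i}: rating {rating} outside 1-4")
        # Negatively worded items are reverse-keyed before summing.
        total += (5 - rating) if i in reverse_keyed else rating
    return total  # 10 (lowest self-esteem) to 40 (highest)

print(score_self_esteem([4, 4, 1, 3, 4, 1, 2, 3, 1, 1]))  # prints 37
```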
2. Self Motives

The idea that persons strive to maintain a positive self-evaluation is obvious. It is not difficult to notice that people respond positively to success and compliments and negatively to failure and insults. They tend to seek out persons who respect their accomplishments and situations in which they can do well. In spite of the obviousness and ubiquity of a self-enhancement motive, at least two other motives have captured some research attention. One is the motive for self-knowledge, i.e., a self-assessment motive, and the other is a consistency or self-verification motive.

Feeling good about ourselves can take us only so far. It is also important to have accurate knowledge about the self. Leon Festinger (1954) suggested that we have a drive to evaluate our abilities and opinions. Indeed, we seem to be fascinated by information about ourselves. We want to know what others think of us. We go to psychotherapists to learn more about ourselves. We are even curious about what the stars have to say about our lives (note the popularity of horoscopes). Systematic research has varied the diagnosticity of experimental tasks, i.e., the extent to which we believe the task is truly revealing of our abilities. Under some conditions, for example, when certainty is low or when we are in a particularly good mood, we prefer diagnostic feedback to flattering feedback. Thus, there is evidence for the self-assessment motive.
Another motive that has received some research attention is the tendency to verify one's current view of the self (Swann 1990). According to this point of view, people are motivated to confirm their self-view. They will seek out persons and situations that provide belief-consistent feedback. If a person has a positive view of self, he or she will seek out others who also evaluate them positively and situations in which they can succeed. Note that this is exactly the same expectation that could be derived from a self-enhancement point of view. However, self-verification and self-enhancement predictions diverge when a person has a negative view of self. The self-verification hypothesis predicts that persons with a negative self-view will seek out others who also perceive them negatively and situations that will lead to poor performance. There is some evidence for the self-verification prediction. On the other hand, neither the self-assessment motive nor the self-verification motive appears to be as robust as the self-enhancement motive (Sedikides 1993).
3. Self-enhancement

As noted above, self-enhancement processes are frequent, easy to observe, and robust. William James and a host of contemporary workers have focused on particular mechanisms by which self-evaluation is affected. For example, James suggests that threats to self-esteem are stronger if they involve abilities on which the self has 'pretensions' or aspires to do well. Others have noted that feelings of success and failure that affect self-evaluation often come from comparison with other persons. Still other investigators have shown the impact of self-enhancement on cognition. For example, the kinds of causal attributions people make for their own successes and failures are often self-serving. Persons tend to locate the causes of success internally (due to my trying, ability) and the causes of failure externally (due to bad luck, task difficulty).

The self-enhancement mechanisms proposed in the psychological literature have been so numerous and so diverse that the collection of them has sometimes been dubbed the 'self zoo.' However, three general classes of mechanism encompass many of the proposed self-enhancement/protection mechanisms: Social comparison, inconsistency reduction, and value expression.
3.1 Social Comparison

One large class of self-enhancement mechanisms concerns social comparisons. The Self-Evaluation Maintenance (SEM) model (Tesser 1988), for example, proposes that when another person does better than we do at some activity, our own self-evaluation is affected. The greater our 'closeness' to the other person
(through similarity, contiguity, personal relationship, etc.) the greater the effect on our self-evaluation. Being outperformed by another can lower self-evaluation by inviting unflattering self-comparison, or it can raise self-evaluation, a kind of 'basking in reflected glory'. (Examples of basking are seen in statements like, 'That's my friend Bob, the best widget maker in the county.') The relevance of the performance domain determines the relative importance of these opposing processes. If the performance domain is important to one's self-definition, i.e., high relevance, then the comparison process will be dominant. One's self-evaluation will be threatened by a close other's better performance. If the performance domain is unimportant to one's self-definition, i.e., low relevance, then the reflection process will be dominant. One's self-evaluation will be augmented by a close other's better performance. Thus, combinations of relative performance, closeness, and relevance are the antecedents of self-esteem threat or enhancement.

The assumption that people are motivated to protect or enhance self-evaluation, combined with the sketch of how another's performance affects self-evaluation, provides the information needed to predict self-evaluation maintenance behavior. An example: Suppose Nancy learns that she made a B+ on the test. The only other person from Nancy's dormitory in this chemistry class, Kaela, made an A+. This should be threatening to Nancy: Kaela outperformed her; Kaela is psychologically close (same dormitory); and chemistry is high in relevance to Nancy, who is studying to be a doctor. What can Nancy do to reduce this threat and maintain a positive self-evaluation? She can change the performance differential by working harder herself or by preventing Kaela from doing well, e.g., hide the assignments, put the wrong catalyst in Kaela's beaker. She can reduce her psychological connection to Kaela, e.g., change dorms and avoid the same classes. Alternatively, she can convince herself that this performance domain is not self-relevant, e.g., chemistry is not highly relevant to the kind of medicine in which she is most interested. Laboratory experiments have produced evidence for each of these modes of dealing with social comparison threat to self-evaluation.
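The SEM predictions just illustrated with Nancy and Kaela can be caricatured as a small decision rule. This is a toy sketch of the verbal model above, with arbitrary thresholds; it is not part of the model's formal statement.

```python
# Toy decision rule caricaturing the SEM predictions described above;
# thresholds are arbitrary and the function is not part of the model's
# formal statement.
def sem_effect(other_outperforms, closeness, relevance):
    """Qualitative predicted effect on self-evaluation (inputs in [0, 1])."""
    if not other_outperforms or closeness < 0.5:
        return "little effect"
    # High relevance -> comparison dominates; low relevance -> reflection.
    return "threat (comparison)" if relevance >= 0.5 else "boost (reflection)"

# Nancy's case: close dormmate Kaela outperforms her in high-relevance chemistry.
print(sem_effect(True, closeness=0.9, relevance=0.9))  # threat (comparison)
# The same gap in a domain irrelevant to Nancy's self-definition.
print(sem_effect(True, closeness=0.9, relevance=0.1))  # boost (reflection)
```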
3.2 Cognitive Consistency

The number of variations within this approach to self-evaluation regulation is also substantial. An example of this approach is cognitive dissonance theory (Festinger 1957). According to dissonance theory, self-esteem is threatened by inconsistency. Holding beliefs that are logically or 'psychologically' inconsistent, i.e., dissonant, with one another is uncomfortable. For example, suppose a student agrees to a request to write an essay in favor of a tuition increase
at her school. Her knowledge that she is opposed to a tuition increase is dissonant with her knowledge that she agreed to write an essay in favor of a tuition increase. One way to reduce this threatening dissonance is for the student to change her attitude to be more in favor of a tuition increase.

Note that social comparison mechanisms and consistency reduction mechanisms are both self-enhancement strategies, yet they seem to have little in common. Threat from dissonance rarely has anything to do with the performance of another, i.e., social comparison. Similarly, inconsistency is generally irrelevant to an SEM threat, whereas the other's performance is crucial. Attitude change is the usual mode of dissonance threat reduction; on the other hand, changes in closeness, performance, or relevance are the SEM modes.
3.3 Value Expression

The notion that expressing one's most cherished values can affect self-esteem also has a productive history in social psychology. Simply expressing who we are, affirming our important values, seems to have a positive effect on self-evaluation. According to self-affirmation theory (e.g., Steele 1988), self-evaluation has at its root a concern with a sense of global self-integrity. Self-integrity refers to holding self-conceptions and images that one is 'adaptively and morally adequate, that is, as competent, good, coherent, unitary, stable, and capable of free choice, capable of controlling important outcomes, and so on' (Steele 1988, p. 262). If the locus of a threat to self-esteem is self-integrity, then the behavior to reduce that threat is self-affirmation, or a declaration of the significance of an important self-value. Again, note that as a self-enhancement strategy, affirming a cherished value is qualitatively different from the SEM behaviors of changing closeness, relevance, or performance, or the dissonance behavior of attitude change.
3.4 Putting It All Together

We have briefly described three classes of self-enhancement mechanisms: Social comparison, cognitive consistency, and value expression. Each of these mechanisms is presumed to regulate self-evaluation, yet they are strikingly different from one another. These differences raise the question of whether self-evaluation is a unitary system or whether there are three (or more) independent self-evaluation systems. The goal of the self-enhancement motive is to maintain positive self-esteem. If there is a unitary self-evaluation system, the various self-evaluation mechanisms should substitute for one another. For example, if behaving inconsistently reduces self-evaluation then a positive social comparison experience, being part of the same
system, should be able to restore self-evaluation. On the other hand, if there are separate self-evaluation systems then a positive social comparison experience will not be able to restore a reduction in self-evaluation originating with inconsistent behavior. One would have to reduce the inconsistency to restore one's self-evaluation. Recent research favors the former interpretation. At least under certain circumstances, the three self-enhancement mechanisms are mutually substitutable for one another in maintaining self-evaluation. In short, self-evaluation appears to be a unitary system with multiple processes for regulating itself (Tesser et al. 1996).
4. The Origins of Self-evaluation

Psychologists have only recently begun to think about the origins of the self-enhancement motive. One line of research suggests that the self-enhancement motive grows out of an instinct for self-preservation coupled with knowledge of our own mortality. Although we as individuals may not live on, we understand that the culture of which we are a part does live on. 'Immortality' comes from our connection with our culture. Self-evaluation is a psychological indicator of the extent to which we are connected and acceptable to our culture and, hence, an index of our own 'immortality' (Pyszczynski et al. 1997). Another line of work builds on the observation that evolution has predisposed us to be social, gregarious animals who are highly dependent on group living. We wish to maintain a positive self-esteem because self-esteem is a kind of 'sociometer' that indicates the extent to which we are regarded positively or negatively by others (Leary et al. 1995).

Also taking an evolutionary perspective, the SEM model builds on the sociometer idea in two ways. It suggests that groups differ in the power they have to affect self-esteem. Compared to psychologically distant groups, psychologically close groups are typically more consequential to our well-being and they have greater impact on our self-esteem. The SEM model also suggests that division of labor is fundamental to groups to maximize efficiency and to avoid conflict. Consequently, self-evaluation is more sensitive to feedback regarding the self's own niche in the group. See Beach and Tesser (in press) for discussion.
5. Summary

Self-evaluation has a productive history in psychology. Individual differences in self-esteem tend to be correlated with a number of positive attributes such as school achievement, general happiness, and lack of depression. There is a strong tendency for people to maintain a positive self-esteem, but there is also evidence of motives for self-accuracy and for self-verification. At least three processes affect self-evaluation: social comparison, cognitive consistency, and value expression. Although these processes are qualitatively different from one another, they are substitutable for one another in maintaining self-esteem. Self-enhancement is thought to have evolutionary roots in the individual's connections to groups.

See also: Cognitive Dissonance; Self-esteem in Adulthood; Self-monitoring, Psychology of; Self-regulation in Adulthood; Social Comparison, Psychology of
Bibliography

Baumeister R F, Smart L, Boden J M 1996 Relation of threatened egotism to violence and aggression: The dark side of high self-esteem. Psychological Review 103: 5–33
Beach S R H, Tesser A in press Self-evaluation maintenance and evolution: Some speculative notes. In: Suls J, Wheeler L (eds.) Handbook of Social Comparison. Lawrence Erlbaum Associates, Mahwah, NJ
Festinger L 1954 A theory of social comparison processes. Human Relations 7: 117–40
Festinger L 1957 A Theory of Cognitive Dissonance. Row, Peterson, Evanston, IL
Greenwald A G, Banaji M R 1995 Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review 102: 4–27
James W 1905 The Principles of Psychology. Holt, New York, Vol. 1
Kernis M H, Waschull S B 1995 The interactive roles of stability and level of self-esteem: Research and theory. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 27, pp. 93–141
Leary M R, Tambor E S, Terdal S K, Downs D L 1995 Self-esteem as an interpersonal monitor: The sociometer hypothesis. Journal of Personality and Social Psychology 68: 518–30
Pyszczynski T, Greenberg J, Solomon S 1997 Why do we need what we need? A terror management perspective on the roots of human social motivation. Psychological Inquiry 8: 1–20
Rosenberg M 1965 Society and the Adolescent Self-image. Princeton University Press, Princeton, NJ
Sedikides C 1993 Assessment, enhancement, and verification determinants of the self-evaluation process. Journal of Personality and Social Psychology 65(2): 317–38
Steele C M 1988 The psychology of self-affirmation: Sustaining the integrity of the self. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 21, pp. 261–302
Swann W B 1990 To be adored or to be known? The interplay of self-enhancement and self-verification. In: Sorrentino R M, Higgins E T (eds.) Handbook of Motivation and Cognition. Guilford Press, New York, Vol. 2, pp. 408–48
Tesser A 1988 Toward a self-evaluation maintenance model of social behavior. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 21, pp. 181–227
Tesser A, Martin L, Cornell D 1996 On the substitutability of self-protective mechanisms. In: Gollwitzer P M, Bargh J A (eds.) The Psychology of Action: Linking Motivation and Cognition to Behavior. Guilford Press, New York, pp. 48–67
A. Tesser
Self-fulfilling Prophecies

A self-fulfilling prophecy occurs when an originally false social belief leads to its own fulfillment. The self-fulfilling prophecy was first described by Merton (1948), who applied it to test anxiety, bank failures, and discrimination. This article reviews some of the controversies surrounding early self-fulfilling prophecy research, and traces how those controversies have led to modern research on relations between social beliefs and social reality. Self-fulfilling prophecies did not receive much attention until Rosenthal and Jacobson's (1968) Pygmalion study. Teachers were led to believe that randomly selected students would show dramatic increases in IQ over the school year. Results seemed to show that, especially in the earlier grade levels, those students gained more in IQ than other students. Thus, the teachers' initially false belief that some students would show unusual IQ gains became true.
1. Controversy, Replication, and Meta-analysis

Rosenthal and Jacobson's (1968) study was highly controversial. Although it seemed to explain the low achievement of disadvantaged students, it was criticized on methodological and statistical grounds. This controversy inspired attempts at replication. Only about one third of these early attempts succeeded (Rosenthal and Rubin 1978). Critics concluded that the phenomenon was unreliable. Proponents concluded that this demonstrated the existence of self-fulfilling prophecies because, if only chance differences were occurring, replications would succeed only 5 percent of the time. This controversy inspired Rosenthal's work on meta-analysis—statistical techniques for summarizing the results of multiple studies. Rosenthal and Rubin's (1978) meta-analysis of the first 345 studies of interpersonal expectancy effects conclusively demonstrated that self-fulfilling prophecies are a real and reliable phenomenon. That meta-analysis also showed that they were neither pervasive (nearly two-thirds of the studies failed to find the effect) nor powerful (effect sizes, in terms of correlation or regression coefficients, averaged 0.2–0.3). This, however, did not end the controversy. Although few modern researchers dispute the existence of self-fulfilling prophecies in general, several do dispute the claims that teacher expectations influence student intelligence (Snow 1995). For example, Snow (1995) concluded that: (a) the expectancy effect disappears if one removes five students with implausible IQ increases (100 points within one year) from the Rosenthal and Jacobson (1968) study, and (b) the literature fails to demonstrate an effect of teacher expectations on IQ.
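The force of the proponents' replication argument can be made concrete with a little arithmetic. The following is a minimal sketch, added here for illustration rather than taken from the sources cited; the one-third success count is an approximation of the figure reported above.

from math import comb

n, p = 345, 0.05   # number of studies; chance 'success' rate (p < .05) under the null
k = n // 3         # roughly one third of studies succeeded (approximate figure)

# Exact binomial tail probability: P(at least k successes by chance alone)
tail = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
print(f"Expected successes by chance: {n * p:.0f}")
print(f"P(at least {k} of {n} studies succeed by chance) = {tail:.2e}")

Under the null hypothesis only about 17 successes would be expected; the probability of observing 115 or more is astronomically small, which is why a one-third replication rate was read as strong evidence that the effect exists.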
2. Self-fulfilling Stereotypes

Some researchers saw in self-fulfilling prophecies an explanation for social inequalities. Thus, in the 1970s, research began addressing the self-fulfilling effects of stereotypes. The main idea was that, if most stereotypes were inaccurate and if self-fulfilling prophecies were common and powerful (as was believed), then negative stereotypes regarding intelligence, achievement, motivation, etc., may produce self-fulfilling prophecies that lead individuals from devalued groups to objectively confirm those stereotypes. The early experimental research seemed to support this perspective:
(a) White interviewers' racial stereotypes could undermine the performance of Black interviewees.
(b) Males acted more warmly towards, and evoked warmer behavior from, female interaction partners erroneously believed to be more physically attractive.
(c) When interacting with a sexist male who was either physically attractive or who was interviewing them for a job, women altered their behavior to appear more consistent with traditional sex stereotypes.
(d) Teachers used social class as a major basis for expectations and treated students from middle class backgrounds more favorably than students from lower class backgrounds (see reviews by Jussim et al. 1996 and Snyder 1984).
3. Widespread Acceptance and More Questions

The role of self-fulfilling prophecies in creating social problems, however, remained unclear because many of the early self-fulfilling prophecy experiments suffered from important limitations. In most, if the expectancy manipulation was successful, perceivers developed erroneous expectations. Under naturalistic conditions, however, perceivers may develop accurate expectations, which, by definition, do not create self-fulfilling prophecies (because self-fulfilling prophecies begin with an initially false belief). Three categories of research have addressed the limitations of the early experiments in different ways.
3.1 Naturalistic Studies of Teacher Expectations

Longitudinal, quantitative investigations of naturally occurring teacher expectancies addressed the accuracy problem directly. All assessed relations between teacher expectations and students' past and future achievement. If teacher expectations predicted future achievement beyond effects accounted for by students' past achievement, results were interpreted as providing evidence consistent with self-fulfilling prophecies. These studies also provided data capable of addressing two related questions: (a) How large are naturally occurring teacher expectation effects? (b) Do teacher expectations predict students' achievement more because they create self-fulfilling prophecies or more because they are accurate? The results were consistent:
(a) In terms of standardized regression coefficients, the self-fulfilling effects of teacher expectations were about 0.1–0.2.
(b) Teacher expectations were strongly based on students' past achievement.
(c) Teachers' expectations predicted students' achievement more because they were accurate than because they led to self-fulfilling prophecies.
(d) Even teachers' perceptions of differences between students from different demographic groups (i.e., stereotypes) were mostly accurate (Jussim et al. 1996).
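The analytic logic of these designs, controlling for past achievement when estimating expectancy effects, can be illustrated with a small simulation. This is a minimal sketch: all data and coefficients are invented for illustration and are not estimates drawn from the studies cited.

import numpy as np

rng = np.random.default_rng(0)
n = 1000
past = rng.normal(size=n)                            # standardized past achievement
expectation = 0.8 * past + 0.2 * rng.normal(size=n)  # expectations largely track past achievement (accuracy)
future = 0.7 * past + 0.15 * expectation + rng.normal(scale=0.5, size=n)

# Ordinary least squares: future achievement on past achievement and teacher expectations
X = np.column_stack([np.ones(n), past, expectation])
beta, *_ = np.linalg.lstsq(X, future, rcond=None)
print("intercept, past, expectation:", np.round(beta, 2))

The coefficient on expectations recovered here (about 0.15 in this invented setup) plays the role of the 0.1–0.2 self-fulfilling effect reported above, while most of the predictive power of expectations flows through their accurate link to past achievement.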
3.2 Naturalistic Studies of Close Relationships

Recent research has begun investigating the occurrence of self-fulfilling prophecies in close relationships. Children come to view their math abilities in a manner consistent with their mothers' sex stereotypes (Jacobs and Eccles 1992). New college roommates change each other's self-perceptions of academic and athletic ability (McNulty and Swann 1994). People who feel anxious that their romantic partners will reject them often evoke rejection from those partners (Downey et al. 1998). Furthermore, the more positive the illusions one holds regarding one's romantic partner, the longer that relationship is likely to continue and the more positively one's romantic partner will come to view him or herself (Murray et al. 1996). As with the teacher expectation studies, however, self-fulfilling prophecy effect sizes average about 0.2.
3.3 Nonconscious Priming of Stereotypes

Chen and Bargh (1997) avoided inducing false expectations by nonconsciously priming a stereotype and observing its effect on social interaction. First, they subliminally presented either African-American or White faces to perceivers. Perceivers and targets (all of whom were White) were then placed into different rooms where they communicated through microphones and headphones. These interactions were recorded and rated for hostility. Perceivers primed with an African-American face were rated as more hostile, and targets interacting with more hostile perceivers reciprocated with greater hostility themselves. The expectancy effect size was 0.23.
4. Moderators

Failures to replicate and generally small effect sizes prompted some researchers to begin searching for moderators—factors that inhibit or facilitate self-fulfilling prophecies (Jussim et al. 1996). This research has identified some conditions under which powerful self-fulfilling prophecies occurred and many conditions under which self-fulfilling prophecies did not occur. Identified moderators include characteristics of perceivers, targets, and the situation.
4.1 Perceiver Moderators

(a) Perceivers motivated to be accurate or sociable are not likely to produce self-fulfilling prophecies.
(b) Perceivers motivated to confirm a particular belief about a target, or to arrive at a stable impression of a target, are more likely to produce self-fulfilling prophecies.
(c) Perceivers with a rigid cognitive style, or who are certain of their beliefs about targets, are more likely to produce self-fulfilling prophecies.
4.2 Target Moderators

(a) Unclear self-perceptions lead people to become more vulnerable to self-fulfilling prophecies.
(b) When perceivers have something targets want (such as a job), targets often confirm perceivers' beliefs in order to create a favorable impression.
(c) When targets desire to facilitate smooth social interactions, they are also more likely to confirm perceivers' expectations.
(d) When targets believe that perceivers hold a negative belief about them, they often act to disconfirm that belief. Similarly, when their main goal is to defend a threatened identity, or to express their personal attributes, they are also likely to disconfirm perceivers' expectations.
(e) Self-fulfilling prophecies are stronger among students from at least some stigmatized social groups (African-American students, students from lower social class backgrounds, and students with a history of low achievement).
4.3 Situational Moderators

(a) Self-fulfilling prophecies are most common when people enter new situations, such as kindergarten or military service.
(b) Experimental studies conducted in educational contexts were much more likely to obtain self-fulfilling prophecies if the expectancy manipulation occurred early in the school year (presumably because teachers were more open to the information at that time).
5. Accumulation

Small self-fulfilling prophecy effects, if they accumulate over time, might lead to large differences between targets. If, for example, stereotype-based expectations lead to small differences in the intellectual achievement of students from middle class or poor backgrounds each year, those differences may accumulate over time to produce large social class differences in achievement. This argument lies at the heart of claims emphasizing the power of self-fulfilling prophecies to contribute to social problems. Self-fulfilling prophecies in the classroom, however, do not accumulate. All studies examining this issue have failed to find accumulation and, instead, have generally found that teacher expectation effects dissipate over time (Smith et al. 1999). Whether self-fulfilling prophecies accumulate outside of the classroom is currently unknown.
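The difference between accumulation and dissipation is easy to see arithmetically. The following minimal sketch contrasts a yearly effect that fully persists with one that decays; the effect size, time horizon, and decay rate are invented for illustration and are not estimates from the literature.

def cumulative_gap(effect=0.2, years=6, retention=1.0):
    """Sum of yearly expectancy effects, with the running gap decaying by `retention` each year."""
    gap = 0.0
    for _ in range(years):
        gap = gap * retention + effect  # carry over the prior gap, then add this year's effect
    return gap

print(cumulative_gap(retention=1.0))  # full accumulation: 6 * 0.2 = 1.2 standard deviations
print(cumulative_gap(retention=0.5))  # dissipation: the gap levels off below 0.4 standard deviations

If each year's small effect persisted, gaps would grow without bound, as the accumulation argument assumes; if effects decay, as the evidence suggests, the gap quickly levels off.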
6. Future Directions

6.1 Naturalistic Studies Outside of Classrooms

Nearly all of the early naturalistic research focused on teacher expectations. Thus, the recent emergence of research on self-fulfilling prophecies in close relationships has been sorely needed, and will likely continue. The expectations that parents, employers, therapists, coaches, etc. develop regarding their children, employees, clients, athletes, etc. are all rich areas for future research.

6.2 Stereotype Threat

Stereotype threat refers to the concern that one's actions may fulfill a negative cultural stereotype of one's group (Steele 1997). Such concerns may, paradoxically, lead to the fulfillment of those stereotypes. For example, African-American students who believe they are taking a test of intelligence (triggering potential concern about confirming negative cultural stereotypes regarding African-American intelligence) perform worse than White students; however, when led to believe that the same test is one of ‘problem-solving,’ the differences evaporate. Similar patterns have occurred for women taking standardized math tests (Steele 1997), and for middle class and poor students on intelligence tests (Croizet and Claire 1998). Stereotype threat is a relatively new concept in the social sciences, and has thus far been used primarily to explain demographic differences in standardized test performance. In addition, it helps identify how cultural stereotypes (beliefs about the widespread beliefs regarding groups) may be self-fulfilling, even in the absence of a specific perceiver with an inaccurate stereotype. As such, it promises to remain an important topic in the social sciences for some time.
7. Conclusion

Self-fulfilling prophecies are pervasive in the sense that they occur in many different contexts. They are not pervasive in the sense that self-fulfilling prophecy effect sizes are typically small, and many studies have failed to find them. Because of the alleged power of expectancy effects to create social problems, teachers have sometimes been accused of perpetrating injustices based on race, class, sex, and other demographic categories. This accusation is unjustified. Teacher expectations predict student achievement primarily because those expectations are accurate. Furthermore, even when inaccurate, teacher expectations do not usually influence students very much; and even when they do influence students, such influence is likely to dissipate over time. Sometimes, however, both inside and outside the classroom, self-fulfilling prophecies can be powerful. In the classroom, the effects among some groups (low achievers, African-Americans, students from lower social class backgrounds) have been quite powerful. Although self-fulfilling prophecies in the classroom do not accumulate, they can be very long lasting—detectable as many as six years after the original teacher–student relationship (Smith et al. 1999). Outside the classroom, recent research has demonstrated the potentially important role of self-fulfilling prophecies in close relationships, and in the maintenance of socio-cultural stereotypes. Thus, self-fulfilling prophecies occur in a wide variety of contexts and are a major phenomenon linking social perception to social behavior.

See also: Decision Making, Psychology of; Personality and Conceptions of the Self; Self-concepts: Educational Aspects; Stereotypes, Social Psychology of; Teacher Behavior and Student Outcomes
Bibliography

Chen M, Bargh J A 1997 Nonconscious behavioral confirmation processes: The self-fulfilling consequences of automatic stereotype activation. Journal of Experimental Social Psychology 33: 541–60
Croizet J, Claire T 1998 Extending the concept of stereotype and threat to social class: The intellectual underperformance of students from low socioeconomic backgrounds. Personality and Social Psychology Bulletin 24: 588–94
Downey G, Freitas A L, Michaelis B, Khouri H 1998 The self-fulfilling prophecy in close relationships: Rejection sensitivity and rejection by romantic partners. Journal of Personality and Social Psychology 75: 545–60
Jacobs J E, Eccles J S 1992 The impact of mothers' gender-role stereotypic beliefs on mothers' and children's ability perceptions. Journal of Personality and Social Psychology 63: 932–44
Jussim L, Eccles J, Madon S 1996 Social perception, social stereotypes, and teacher expectations: Accuracy and the quest for the powerful self-fulfilling prophecy. Advances in Experimental Social Psychology 28: 281–388
Merton R K 1948 The self-fulfilling prophecy. Antioch Review 8: 193–210
McNulty S E, Swann W B 1994 Identity negotiation in roommate relationships: The self as architect and consequence of social reality. Journal of Personality and Social Psychology 67: 1012–23
Murray S L, Holmes J G, Griffin D W 1996 The self-fulfilling nature of positive illusions in romantic relationships: Love is not blind, but prescient. Journal of Personality and Social Psychology 71: 1155–80
Rosenthal R, Jacobson L 1968 Pygmalion in the Classroom: Teacher Expectations and Pupils' Intellectual Development. Holt, Rinehart, and Winston, New York
Rosenthal R, Rubin D B 1978 Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences 1: 377–415
Smith A E, Jussim L, Eccles J 1999 Do self-fulfilling prophecies accumulate, dissipate, or remain stable over time? Journal of Personality and Social Psychology 77: 548–65
Snow R E 1995 Pygmalion and intelligence? Current Directions in Psychological Science 4: 169–71
Snyder M 1984 When belief creates reality. Advances in Experimental Social Psychology 18: 247–305
Steele C M 1997 A threat in the air: How stereotypes shape intellectual identity and performance. American Psychologist 52: 613–29
L. Jussim
Self: History of the Concept

It is often asserted that the concept of the self emerges only in early modern times in connection with the concern for subjectivity that is taken to be characteristic of modernity. While it is true that the term ‘self’ as a noun, describing that which in a person is really and intrinsically this person, comes into use only from the seventeenth century onwards, what can be called the question of selfhood was not unfamiliar to earlier thinkers. This is, at its core, the question whether there is—and if so, what is—some unity, or at least continuity and coherence, of a human being over the life-course beyond their bodily constitution, beyond the unity of the body. In the history of human thought, this question has been answered in a great variety of different ways. Broadly, empirical traditions tended to doubt the
existence of such unity. They could observe changes, even radical transformations, in the human mind and had thus no ground to postulate any a priori unity. Transcendental traditions, in contrast, tended to argue that there had to be a unity of apperception, or of consciousness. Otherwise, not even the question of the self could be asked. In the intellectual space between these two positions, so to speak, the issue could be addressed in ways that are more specific to the social and psychological sciences. The faculty of memory, for instance, could then be seen as enabling a sense of continuous self to emerge. Human subjects have the ability to narrate their lives. Or, selves could be seen as formed in interaction with others, that is, in the mirroring of an image of oneself through others' responses to one's own words and actions. In moral philosophy, the existence of a continuous self was seen as a precondition for holding human beings accountable for their past deeds. Selfhood was thus linked to moral and political responsibility.
1. Selfhood in the Social Sciences

In these forms, the question of the self was already posed at the historical moment when the social sciences, by and large in the contemporary understanding of the term, arose towards the end of the eighteenth century. It is noteworthy, then, that the emerging social sciences rather reduced the range of ways of exploring selfhood across the nineteenth century. In some of their fields, like liberal political philosophy and political economy\economics, they postulated a rational self, able to make choices and to be responsible for her, or rather his, deeds. In other fields, in particular in the sociological way of reasoning, the orientations and behaviors of human beings were seen as determined by their social position. This opposition has become known as the one between an under- and an oversocialized conception of the human being (Wrong 1961). More cautiously, the empirical and behavioral strands of social research restricted themselves to observing human behavior and refrained from making any assumptions about selfhood at all. Regularities emerged here only through the aggregation of observations. What all these approaches had in common, though, was that they aimed at regularizing and stabilizing human orientations and behavior. Whether human beings were under- or oversocialized or just happened to behave according to patterns, the broader question of selfhood, namely whether and how a sense of continuity and coherence in a human being forms, was rather neglected or answered a priori by theoretical postulate. In this light, it seems appropriate to say that the social sciences have developed a serious interest in
questions of human selfhood only late, systematically only in the early twentieth century. Furthermore, they have largely ignored ways of representing the human self that were proposed in other areas, in philosophy for instance, but maybe even more importantly in literature. Thus, issues such as individuality and subjectivity, the possible idiosyncrasy of a life-course and life-project, have long remained outside the focus of the social sciences, which have tended to look at the question from the perspective of a fully developed, stable, personal identity, rather than making the continuity and coherence of the self an issue open to investigation. This focus can hardly be understood otherwise than through the perceived need to ground an expectation of political stability of the world in a presupposed coherence of human orientations and actions (see Freedom\Liberty: Impact on the Social Sciences). The ground for an interest in such broader issues was prepared outside of what is conventionally recognized as social science, namely by Friedrich Nietzsche, and later by Sigmund Freud. Nietzsche radically rejected the problems of moral and political philosophy and thus liberated the self from the impositions of the rules of collective life. Freud located the drives toward a fuller realization of one's self in the human psyche and connected the history of civilization with the repression of such drives. Against such a ‘Nietzschean-Freudian’ background (Rorty 1989), Georg Simmel and George Herbert Mead could observe the ways in which identities are formed in social interaction and conceptualize variations of self-formation in different social contexts. From then on, a sociology and social psychology of selfhood and identity has developed which no longer relies on presuppositions about some essence of human nature and is able to connect its findings to both child psychology and phenomenology. It emphasizes the socially constructed nature of selfhood, but remains capable, at least in principle, of analyzing the specific social contexts of self-formation, thus working towards a comparative-historical sociology of selfhood. This broadened debate on selfhood has created a semantic space in which aspects of the self are emphasized in various ways (Friese 2001). This space can be described by connecting the concept of self to notions of modernity, of meaning, and of difference.
2. Selfhood and Modernity

The idea of human beings as autonomous subjects is often taken to be characteristic of modernity as an era, or at least of the self-understanding of modernity as an ethos (see Modernity: History of the Concept). Such a view entails a conception of the self as rather continuous and coherent, since only on such a basis is the choice of a path of action and a way of life, as well as the acceptance of responsibility for one's deeds, conceivable. The concept of selfhood is here
conditioned by the need to maintain a notion of human autonomy and agentiality as a basic tenet of what modernity is about, namely the possibility of shaping the world by conscious human action. Such modernism, however, is not necessarily tied to the atomist and rationalist individualism of some versions of economic and political thought, most notably neoclassical economics and rational choice theory. In the light of the twentieth-century developments in sociology and (social) psychology mentioned above, the view that connects selfhood to modernity mostly—in all its more sophisticated forms—starts out from an assumption of the constitutive sociality of the human being. Not absolute autonomy, but rather the conviction that human beings have to construct their self-identities, can then be seen as characteristically modern (Hollis 1985). Unlike the concept of the rational, autonomous self, this concept is open towards important qualifications in terms of the corporeality, situatedness, and possibly nonteleological character of human action (Joas 1996). Rather than presupposing the self-sustained individual of modernity, the aim is to demonstrate how autonomous selves develop through social interactions over certain phases of the life-course (Joas 1998, Straub 2001). The commitment to autonomy and agentiality, characteristic of modernity, becomes visible rather in the fact that the formation of self (or, of self-identity) is here understood as the forming and determining of the durably significant orientations in a life. It is related to the formation of a consciousness of one's own existence and thus, biographically, predominantly to the period of adolescence. Crises of identity accordingly occur during growing up; more precisely, one should speak of life crises during the formation of one's identity. Once constituted, self-identity is seen as basically stable from then on. No necessary connection is presupposed between self-formation and individuality; theoretically, human beings may well form highly similar identities in great numbers. Since the very concept of identity is connected to continuity and coherence of self, however, stability is turned into a conceptual assumption. Objections against such a conceptualization go in two different directions. On the one hand, this discourse, which often has its roots in (social) psychology, stands in a basic tension to any culturalist concept of identity, which emphasizes meaning. On the other hand, doubts about the presupposition of continuity and coherence of selfhood employ notions of difference and alterity that are not reducible to the idea that selves are formed by relating to others, by intersubjectivity.
3. Selfhood and Meaning

Some critics of the close connection between self-formation and modernity argue that every form of selfhood is dependent on the cultural resources that
are at hand to the particular human being when giving shape to their important orientations in life. Human beings give meaning to their lives by interpreting their situations with the help of moral-cultural languages that precede their own existence and surround them. Cultural determinism is the strong version of such theorizing, mostly out of use nowadays, but many current social theories adopt a weaker version of this reasoning which indeed sustains the notion of the continuity and coherence of the self but sees this self as strongly embedded in cultural contexts. Such a view of the self has its modern source in romanticism (see Romanticism: Impact on Social Thought). Philosophically, it emerged as a response to the rationalist leanings of the Enlightenment, and politically, against the conceptions of abstract freedom in individualist liberalism. Unlike cultural determinism, however, which has a strongly oversocialized conception of the human being and thus hardly any concept of self at all, romanticism emphasizes agency and creativity in the process of self-formation and self-realization. Human beings are seen in their singularity, but the relation to others is an essential and inescapable part of their understanding of their own selves. Charles Taylor's inquiry into the sources of the self is a most recent and forceful restatement of this conception. Taylor starts out from the familiar argument that the advent of modernity means that common frameworks for moral evaluation can no longer be presumed to exist. The key subsequent contention is that the ability and inclination to question any such existing framework of meaning does not lead to a sustainable position holding that no such frameworks are needed at all, a view he calls the ‘naturalist supposition’ (Taylor 1989, p. 30). If such frameworks of meaning are what give human beings identity and allow them to orient themselves in social and moral space, then they are not ‘things we invent’ and might as well not have invented. They need to be seen as ‘answers to questions which inescapably pre-exist for us,’ or, in other words, ‘they are contestable answers to inescapable questions’ (Taylor 1989, pp. 38, 40). Taylor develops here the contours of a concept of inescapability as part of a moral-social philosophy of selfhood under conditions of modernity. Meaning-centered conceptions of selfhood have recently been invoked as a basis for communitarian positions in moral and political philosophy, such as Sandel's (1982) concept of the ‘encumbered self.’
4. Selfhood and Otherness

Arguably, these two concepts of selfhood remain within the frame of a debate in which the under- and the oversocialized views of the human being occupy the extreme points. The introduction of interaction
and intersubjectivity into accounts of self-formation has created intermediate theoretical positions and, possibly more importantly, has allowed different accentuations of selfhood without making positions mutually incompatible. Nevertheless, the modernity-oriented view still emphasizes a self that acts upon the world, whereas the meaning-oriented view underlines the fact that the self is provided the sense of its actions by the world of which it is a part. Since the mid-1970s, in contrast, theoretical and empirical developments have tended to break up the existing two-dimensional mode of conceptualization. On the one hand, the notion of a ‘decentering of the subject’ has been proposed, mainly in poststructuralist discussions. On the other hand, the observation of both multiple and changing basic orientations in human beings has led to the proliferation of the—infelicitously coined—term ‘postmodern identity’ as a new form of selfhood. In both cases, a strong concept of self has been abandoned, in the one case on the basis of philosophical reflection, in the other, grounded on empirical observation. Both the ideas of a ‘decentering’ and of a ‘postmodern self’ question some major implications of the more standard sociological view of selfhood, namely the existence of the human self as a unit and its persistence as the ‘same’ self over time. In the former perspective, the philosophical maxim of thinking identity and difference as one double concept, rather than as one concept opposed to another, is translated into sociological thinking as the need to think self and other as a relation, rather than as a subject and a context or object. Such notions rather underline the nonidentitarian character of being by pointing to the issue of ‘the other in me’ (Emmanuel Lévinas) as a question of co-constitution rather than of interaction. While emphasized in recent poststructuralist thought, similar ideas can be found, for instance, in Arendt's (1978, pp. 183–7) insistence on the ‘two-in-one,’ on the relation to oneself as another, as the very precondition for thought. In a broad sense, they go back to the ancient view of the friend as an ‘other self.’ And Theunissen (1977\1965) had already conceptualized self–other relations on the basis of an understanding of ‘the Other’ as referring to all ‘concepts through which philosophy interprets the presence and present of the fellow\co-human being (Mitmensch) or of the transcendental original form of this human being.’ In one particular formulation, Cavell's reflections provide an example of a thinking about selfhood that does not presuppose an idea of identity, coherence, or consistency (Cavell 1989, 1990). He cautions against ‘any fixed, metaphysical interpretation of the idea of a self’ and against the idea of ‘a noumenal self as one's “true self” and of this entity as having desires and requiring expression.’ In contrast, Cavell suggests that the ‘idea of the self must be such that it can contain, let us say, an intuition of partial compliance with its idea
Self: History of the Concept of itself, hence of distance from itself’ or, in other words, he advocates the idea of ‘the unattained but attainable self’ (Cavell, 1990, pp. 31, 34). Cavell proposes here a relation of the unattained and the attainable as constitutive for the self, that is, he makes the very question of attainability a central feature of a theory of selfhood.
5. Selfhood and Sociohistorical Transformations

The development of this threefold semantic space has considerably enriched the conceptualization and analysis of selfhood in the social sciences during the twentieth century. If one considers the three fields of inquiry as explorations of the various dimensions of selfhood, rather than as mutually exclusive perspectives, the question of the constitution of the human self emerges as a problématique that concerns all human beings and for which there may be a variety of processes that can be determined generally only in their forms but not in their results. As a consequence, however, it becomes much more difficult to draw conclusions from the prevailing character of selfhood about social structure and political order. This has often been seen as a ‘weakness’ of symbolic interactionism, for instance, which Mead's view of the self is said to have inaugurated, as a social theory that allegedly cannot address issues of societal constitution. But by the same move Mead allows for and recognizes a plurality of selves that has returned to the center of discussion today—after the renewal of a regressive synthesis of identity and society in Talcott Parsons's work (to which—what is often overlooked—Erik Erikson contributed as well). The question of the relation between selfhood and society and politics thus needs to be rephrased as a question of comparative-historical sociology rather than merely of social theory. Some indications as to how this relation should be understood can be found when reading the historical sociology of twentieth-century societies in this light. The social upheaval during the second half of the nineteenth century, with industrialization, urbanization, and the phrasing of ‘the social question,’ is often seen as a first ‘modern’ uprooting of established forms of selfhood, as a first massive process of ‘disembedding’ (Giddens 1990). The development towards so-called mass societies during the first half of the twentieth century raised the question of the relation between individuation and the growth of the self. Totalitarianism has been analyzed in terms of an imbalance between imposed individuation and delayed self-formation, the result having been the tendency towards ‘escape from freedom’ and into stable collective identities (Fromm 1941, Arendt 1958). The first three decades after the Second World War can then be considered a form of re-embedding of selves into the institutional frames of democratic welfare societies. Most recently, the indications of dissolution and dismantling of the rather comprehensive set of
social institutions of the interventionist welfare state are one of the reasons to focus sociological debate again on questions of selfhood and identity. During this period, some contributions argue, it is no longer continuity and coherence but transience, instability, and an inclination to change that mark the important life orientations of contemporary human beings (see Shotter and Gergen 1989, Lash and Friedman 1992, Kellner 1995). However, many of these analyses are challenged on the grounds of the limited representativeness of their empirical samples, or on conceptual grounds. Given the theoretical insights described above as the emergence of the threefold semantic space of selfhood, any attempt to offer a full-scale reformulation of the issue of the varieties of selfhood in different sociohistorical configurations would be adventurous. Forms of self-constitution and substantive orientations of human selves are likely to be highly variable across large populations with varieties of experiences and sociocultural positions. However, the notion of recurring crises of modernity and the identification of historically distinct processes of disembedding and re-embedding could be the basis for a socially more specific analysis of the formation and stability of social selves (Wagner 1994). The social existence of the ‘modern’ idea that human beings construct their selves is what many societies have had in common throughout the 1800s and 1900s. As such, it does not give any guidance in defining different configurations. Therefore, three qualifying criteria have been introduced. First, the existence of the idea of the construction of selfhood still leaves open the question whether all human beings living in a given social context share it and are affected by it. The social permeation of the idea may be limited. Second, human beings in the process of constructing their selves may consider this a matter of choice, as a truly modernist perspective would have it. In many circumstances, however, though a knowledge and a sense of the fact of social construction prevails, it may appear to human beings almost natural, in some sense pre-given or ascribed, which social self they are going to have. Third, the stability of any self one has chosen may vary. Such a construction of selfhood may be considered a once-in-a-lifetime occurrence, but may also be regarded as less binding and, for instance, open to reconsideration and change at a later age. These criteria widen the scope of the constructability of selfhood. All conditions of construction have existed for some individuals or groups at any time during the past two centuries in the West. But the widening of the scope of construction may mark the transitions from one social configuration of modernity to another. These transitions entail social processes of disembedding and provoke transformations of social selves, in the course of which not only are other selves acquired but the possibility of construction is also more widely perceived.
See also: Culture and the Self (Implications for Psychological Theory): Cultural Concerns; Identity and Identification: Philosophical Aspects; Identity in Anthropology; Identity: Social; Interactionism and Personality; Modernity; Person and Self: Philosophical Aspects; Personality and Conceptions of the Self; Self: Philosophical Aspects
Bibliography

Arendt H 1958 The Origins of Totalitarianism, 2nd edn. World Publishing Company, Cleveland, OH, and Meridian Books, New York
Arendt H 1978 The Life of the Mind, Vol. 1: Thinking. Harcourt, Brace, Jovanovich, New York
Cavell S 1989 The New yet Unapproachable America: Lectures After Emerson After Wittgenstein. Living Batch Press, Albuquerque, NM
Cavell S 1990 Conditions Handsome and Unhandsome: The Constitution of Emersonian Perfectionism. The University of Chicago Press, Chicago
Friese H 2001 Identity: Desire, name and difference. In: Friese H (ed.) Identities. Berghahn, Oxford, UK
Fromm E 1941 Escape from Freedom. Holt, Rinehart, and Winston, New York
Giddens A 1990 The Consequences of Modernity. Polity, Cambridge
Hollis M 1985 Of masks and men. In: Carrithers M, Collins S, Lukes S (eds.) The Category of the Person: Anthropology, Philosophy, History. Cambridge University Press, Cambridge, UK and New York
Joas H 1996 The Creativity of Action [trans. Gaines J, Keast P]. Polity, Cambridge
Joas H 1998 The autonomy of the self: The Meadian heritage and its postmodern challenge. European Journal of Social Theory 1(1): 7–18
Kellner D 1995 Media Culture, Cultural Studies, Identity and Politics Between the Modern and the Postmodern. Routledge, London and New York
Lash S, Friedman J (eds.) 1992 Modernity and Identity. Blackwell, Oxford, UK
Rorty R 1989 Contingency, Irony, and Solidarity. Cambridge University Press, Cambridge, UK
Sandel M J 1982 Liberalism and the Limits of Justice. Cambridge University Press, Cambridge, UK and New York
Shotter J, Gergen K J (eds.) 1989 Texts of Identity. Sage, London
Straub J 2001 Personal and collective identity: A conceptual analysis. In: Friese H (ed.) Identities. Berghahn, Oxford, UK
Taylor C 1989 Sources of the Self: The Making of Modern Identity. Harvard University Press, Cambridge, MA
Theunissen M 1977\1965 Der Andere. de Gruyter, Berlin
Wagner P 1994 A Sociology of Modernity: Liberty and Discipline. Routledge, London and New York
Wrong D H 1961 The oversocialized conception of man in modern sociology. American Sociological Review 26: 183–93
P. Wagner
Self-knowledge: Philosophical Aspects

‘Self-knowledge’—here understood as the knowledge a creature has of its own mental states, processes, and dispositions—has come to be, along with consciousness, an increasingly compelling topic for philosophers and scientists investigating the structure and development of various cognitive capacities. Unlike consciousness, however, the capacity for self-knowledge is generally held to be the exclusive privilege of human beings—or, at least, of creatures with the kind of conceptual capacity that comes with having a language of a certain rich complexity. Prima facie, self-knowledge involves more than simply having sensations (or sensory experiences). Many say it involves even more than having moods, emotions, and ‘first-order’ intentional states, paradigmatically beliefs and desires, but also hopes, fears, wishes, occurrent thoughts, and so on. Some philosophers disagree, claiming that first-order intentional states (individuated by ‘propositional content’) are themselves properly restricted to language-using creatures and, by that token, to self-knowing creatures. Nevertheless, it is generally agreed that self-knowledge involves, minimally, the capacity for a conceptually rich awareness or understanding of one's own mental states and processes, and hence the capacity, as it is often put, to form ‘second-order’ beliefs about these, a capacity that can be extended to, or perhaps only comes with, forming beliefs about the mental states and processes of others. More generally, then, self-knowledge involves the capacity for adopting what Daniel Dennett calls the intentional stance (Dennett 1987)—understanding oneself (and others) as having experiences and ways of representing the world that shape agential behavior. But this capacity should not be understood in purely spectatorial terms. The more sophisticated a creature's intentional understanding, the more sophisticated it will be at acting in a world partially constituted by agential activity. Thus, the capacity for intentional understanding continually plays into, and potentially modifies, an agent's own intentionally directed activity, a point I will return to below.
1. Empirical Findings: Testing the Limits of Intentional Understanding

A number of experiments have been designed to test for the presence of such a capacity in nonhuman animals (primarily hominids) and very young children, with interestingly controversial results. Some investigators claim that the evidence for intentional understanding (including a fairly sophisticated awareness of self) is persuasively strong in the case of certain primates (chimpanzees and bonobos, though for a deflationary interpretation of this evidence based on
further empirical trials, see Povinelli and Eddy (1996)). Experiments with young children (most famously the ‘false-belief task’) seem clearly to indicate that intentional understanding comes in degrees, passing through fairly discernible stages. How precisely to characterize these stages and what constitutes the mechanism of change is hotly debated (Astington et al. 1988), yet there is an emerging pattern in the empirical data that is noteworthy for standard philosophical discussions of self-knowledge. First, under certain, not very unusual conditions, young children will make mistakes in attributing basic mental states (desires and beliefs) to themselves that are inconceivable from the adult point of view, because of the direct first-personal awareness adults seem to have of these states. Second, the errors children make in self-attribution mirror in kind and frequency the errors they make in attributing mental states to others; there seems to be no dramatic asymmetry between self-knowledge and knowledge of other minds, at least for children (Flavell et al. 1995, Gopnik 1993). Moreover, though children become increasingly more sophisticated in their intentional self- and other-attributions, this symmetry and systematicity in kind and frequency of error seems to persist in notable ways into adulthood, with subjects ‘confabulating’ explanations for their actions and feelings that are purportedly based on direct introspective awareness of their own minds (Nisbett and Wilson 1977). To psychologists investigating these phenomena, this pattern of results suggests that, for both children and adults, judgments about mental states and processes are affected (perhaps unconsciously) by background beliefs about how such states are caused and how they in turn cause behavior, and these beliefs can be systematically wrong, at least in detail. There is nothing special about the capacity for self-knowledge that avoids the error-causing beliefs embedded in our ‘folk-psychology.’
2. The Philosophical Project: Explaining the Special Character of Self-knowledge

Although judgments about our own mental states and processes are clearly not infallible, there are distinctive features of self-knowledge that require explanation. These features, and the interrelations among them, are variously presented in the philosophical literature, but they are generally held to consist in the following (Wright et al. 1998, pp. 1–2).
(a) Immediacy—the knowledge we have of our own sensations, emotions, and intentional states is normally not based on behavioral or contextual evidence; it is groundless or immediate, seemingly involving some kind of ‘special access’ to our own mind.
(b) Saliency (or transparency)—knowledge of our own minds, especially with regard to sensations and occurrent thoughts, is unavoidable; if we have a
particular thought or sensation, we typically know that we have it (such mental states are ‘self-intimating’).
(c) (First-person) authority—judgments about our own minds (expressed in sincere first-person claims) are typically treated by others as correct simply by virtue of our having made them. Normally, such claims require no justification and are treated as true, so long as there is no overwhelmingly defeating counter-evidence to suggest that we are wrong about ourselves.
The traditional account against which most discussions of self-knowledge react is drawn from René Descartes (Descartes 1911). Viewing the mind as a non-physical substance directly knowable only from the first-person point of view, Descartes had a ready explanation of first-person authority: we clearly and distinctly perceive the contents of our own minds, which are by their nature transparent to us. Alternatively, since we have no special perceptual access to the minds of others, the only way we can make judgments about them is by observing their external bodily movements and inferring from their similarities to ours that they are likewise caused by mental states and processes. Such weakly inferred third-person judgments are perforce dramatically less secure than first-person judgments, which are themselves infallible. On this view, the commonly acknowledged asymmetry between first- and third-person judgments is exaggerated in both directions. This Cartesian account has been discredited in almost everyone's estimation—and not just empirically, but conceptually. Even the picturesque metaphor of the ‘mind and its contents’ was held by Gilbert Ryle to encourage such bad habits of thought that nothing worthwhile could be said in its defense. The ‘metaphysical iron curtain’ it introduced between knowledge of self and knowledge of others was, in Ryle's view, sufficient to show the bankruptcy of the Cartesian picture by making impossible our everyday practices of correctly applying, and being able to correct our applications of, ‘mental conduct’ concepts either to others or to ourselves. Since this would undermine the possibility of a coherent language containing such concepts, it undermines the possibility of our conceptualizing ourselves intentionally at all. Thus, so far from providing an explanation of our capacity for self-knowledge, Descartes' theory discredits it. We find out about ourselves in much the same way that we find out about other people, Ryle is famous for saying: ‘A residual difference in the supplies of the requisite data makes some differences in degree between what I can know about myself and what I can know about you, but these differences are not all in favor of self-knowledge’ (Ryle 1949, p. 155). Although Ryle exhaustively illustrated his view that the ways we find out about ourselves are as various as the kinds of things we find out, thus debarring any simple philosophical analysis of self- and other-knowledge, he is
generally thought to have promoted the doctrine of ‘logical behaviorism’ (not Ryle's term), according to which all talk about so-called ‘mental’ states and processes (our own or anyone else's) is really just talk about actual and potential displays of overt behavior. In fact, though clearly suspicious of philosophical defenses of first-person authority based on a principled epistemic privilege, Ryle is more profitably read as following Wittgenstein in trying to reveal how the ‘logical (or ‘grammatical’) structure’ of a pervasive way of thinking about the ‘mind and its contents’ inevitably generates self-defeating explanations of various phenomena, including our capacity for self-knowledge. Any substantial account of such phenomena must therefore exemplify a different kind of logical structure altogether (see discussions of Wittgenstein's view in McDowell 1991, Wright 1991).
3. Current Proposals: ‘Causal–Perceptual’ vs. ‘Constitutive’ Explanations of Self-knowledge

This proposed opposition between different ‘logical types’ of explanation is given more substance in contemporary work, with philosophers pursuing two quite different methodological approaches to understanding the purported immediacy, saliency, and authority of self-knowledge. The first, ‘causal–perceptual’ approach focuses on the mechanics of self-knowledge, including the nature of the states we are purportedly making judgments about, the nature of the states that constitute those judgments, and the nature of the connection between them. The second, so-called ‘constitutive’ approach focuses on the larger implications of our capacity for self-knowledge, including what role it plays in specifying conditions for the possibility of our powers and practices of agency. Adopting either one of these approaches does not rule out addressing the questions featured by the alternative approach, although the sorts of answers given are materially shaped by which questions occupy center field.
3.1 Causal–Perceptual Accounts of Self-knowledge

According to some philosophers, the basic defects of the Cartesian view stem not from the idea that first-person knowledge is a form of inner perception, but from its dualistic metaphysics and an overly restricted conception of what perception involves. On Descartes's ‘act-object’ model of perception, mental states are objects before the mind's ‘inner eye’ which we perceive (as we do external objects) by the act of discerning their intrinsic (nonrelational) properties. Even without mounting any radical ‘grammatical’ critique, problems with this view can be readily identified,
especially once the mind is understood to be part of the material world (see Shoemaker 1996, Wright et al. 1998).
(a) There is no organ of inner perception analogous to the eye (but we do have a capacity for sensing the position of our bodies, by proprioception, and this does not require any specialized organ; perhaps introspection is like proprioception).
(b) Sensations could possibly count as ‘objects’ of inner perception, but intentional states (beliefs and desires) are less plausible candidates. First, phenomenologically, we do not seem to be aware of our intentional states as discernibly perceptible in the way that sensations are; there is no perceptual experience that precedes, or comes with, our judgments about such states—no ‘qualia.’ Second, intentional states are individuated by their propositional content, and their propositional content is plausibly determined by (causal) relational features these states bear to the subject's physical or social environment, rather than by intrinsic features of the states themselves. If the ‘externalist’ view of content is right, then authoritative self-knowledge of these states cannot be based on inwardly perceiving their intrinsic properties (see Ludlow and Martin 1998).
(c) In contrast with external perception, there are no mediating perceptual experiences between the object perceived and judgments based on those perceivings. If mental states and processes are objects of inner perception, they must be ‘self-intimating’ objects, where the object of perception is itself a perceptual experience of that object. Taken literally, this idea is hard to make coherent, though some have found it invitingly appropriate in the case of sensations as a way of marking what is so unusually special and mysterious about them.
Many of these problems (and more) are satisfactorily addressed by adopting a ‘tracking’ model of inner perception (Armstrong 1968). We do have ‘special access’ to our own minds in so far as the second-order beliefs we form about our (ontologically distinct) first-order states and processes are reliably caused, perhaps subcognitively, by the very first-order states and processes such beliefs are about. There need be no distinctive phenomenology that characterizes such access, and no need therefore to give different accounts of our self-knowledge of intentional states versus sensations, imaginings, and other phenomenologically differentiated phenomena. Of course, if all such states are functionally defined—that is, dispositionally, in terms of their causal relations to other mental states, to perceptual inputs and to behavioral outputs—there is nothing about them that makes them in principle accessible only to the person whose states they are. But this is a strength of the approach. Such states are accessible in a special way to the subject (by normally and reliably causing the requisite second-order beliefs), thereby accounting for the immediacy, saliency, and authority of our first-person judgments.
Yet because causal mechanisms can break down in particular cases, first-person error may be expected (as in self-deception). Such errors may also be detected: since first-order states are constitutively related to a subject's linguistic and nonlinguistic behavior, significant evidence of noncohesiveness between the subject's sayings and doings may be used to override her self-ascriptions. Moreover, there is no inconsistency between this account of the special character of self-knowledge and psychologists' findings of systematic error in self- and other-attribution. It may be just as they suggest: since the concepts in terms of which we express our second-order beliefs (about self and other) are functionally defined in terms of folk-psychological theory, systematic error may result from the theory's deficiencies.
3.2 Constitutive Accounts of Self-knowledge

The causal–perceptual approach to self-knowledge has many advantages, including, it would seem, the ontological and conceptual independence of first-order states from the second-order beliefs that track them in self-knowers. For this means there is no conceptual bar on attributing beliefs, desires, and so forth to creatures that lack the capacity for self-knowledge but still behave rationally enough to be well predicted from the intentional stance. Why then do some philosophers insist that having such first-order intentional states is constitutively linked to having authoritative self-knowledge—and, therefore, presumably, to a subject's forming second-order beliefs about her first-order states for which ‘truth is the default condition’ (Wright 1991)? To begin with a negative answer: if causal–perceptualists are right in saying that the subject's status as an authoritative self-knower is merely contingently dependent on a reliable causal mechanism linking first- and second-order states, then we cannot account for the special role self-knowledge plays in constituting that subject as a rational and responsible agent (Bilgrami 1998, Burge 1996, Shoemaker 1996). How so? Consider, for instance, the causal–perceptualist account of error in self-attribution: the subject forms a false second-order belief due to a breakdown in the causal mechanism normally linking that belief to the appropriate first-order state. Her lapse of rationality consists in her saying one thing (because of the second-order belief) and doing another (because of her first-order states). But why count this a lapse of rationality (as surely we do), rather than simply a self-misperception, analogous (except for the causal pathway) to misperceptions of others? What imperative is there for a subject's self-attributions to line up with her other sayings and doings in order for her to be a coherent subject, responsible and responsive to the norms embedded in our folk-psychological conception of rational agency?
Different authors take various tacks in answering this question, but there is a discernible common thread. An agent cannot act linguistically or nonlinguistically in ways that respond self-consciously to these norms, unless she knows in a directive sense what she is doing—i.e., unless she recognizes herself as acting in accord with certain intentional states that she knows to be hers, and knows in a way that acknowledges her own agency in producing and maintaining them. Moreover, an agent is self-consciously responsive to norms that govern her own intentional states (such as believing that p), not just by recognizing what counts as norm-compliant behavior, but actually by regulating herself (her judgments, thoughts, and actions) in accord with such norms. On this picture, if a person’s sincere claims about herself do not cohere with the (other) things she does and says, we may say she fails to know herself. But, just as importantly, we say that she acts unknowingly. That is, there is some sense (perhaps pathological) in which she acts, but without authoring her own acts as a free and responsible agent. Hence, she does not authorize such acts; they are out of her reflective control. To fail in self-knowledge on this account is not to be wrong in the passively spectatorial sense that one can be wrong about others; it is to fail more actively in exercising one’s powers of agency. Constitutive explanations of self-knowledge display a very different logical structure from causal–perceptual accounts. Consequently, they suggest a different way of conceptualizing the intentional capacities of self-knowers, a way that has become needlessly complicated through retaining the traditional architecture of first-order states and second-order beliefs about them that are ontologically, if not conceptually, distinct and that supposedly underlie and get expressed in first-person claims (McGeer 1996). This distinction makes sense on a causal–perceptualist account of self-knowledge; but it only confuses and obscures on the constitutive account, leading, for instance, to inappropriate descriptions of what distinguishes us from other animals. The constitutive explanation of self-knowledge focuses on what it means to be an intentional creature of a certain sort, an intentional creature that is capable of actively regulating her own intentional states, revising and maintaining them in self-conscious recognition of various norms. No doubt there are automatic mechanisms also involved in regulating intentional states in an environmentally useful way, mechanisms we may well share with nonlinguistic creatures and which explain why adopting the intentional stance towards them works as well as it does (Shoemaker 1996, essay 11). But in our own case, we increase the range and power of the intentional stance by understanding what rational, responsible agency requires of us and molding ourselves to suit. The apparatus of first- and second-order states may play some useful role in further elucidating this capacity for intentional self-regulation, but more likely
it encourages bad habits of thought—a lingering vestige of the dying, but never quite dead, Cartesian tradition in epistemology and philosophy of mind.
See also: Culture and the Self (Implications for Psychological Theory): Cultural Concerns; Person and Self: Philosophical Aspects; Self-development in Childhood; Self: History of the Concept; Self: Philosophical Aspects
Bibliography
Armstrong D M 1968 A Materialist Theory of the Mind. Routledge and Kegan Paul, London
Astington J W, Harris P L, Olson D R (eds.) 1988 Developing Theories of Mind. Cambridge University Press, Cambridge, UK
Bilgrami A 1998 Self-knowledge and resentment. In: Wright C, Smith B C, Macdonald C (eds.) Knowing Our Own Minds. Clarendon Press, Oxford, UK
Burge T 1996 Our entitlement to self-knowledge. Proceedings of the Aristotelian Society 96: 91–111
Dennett D C 1987 The Intentional Stance. MIT Press, Cambridge, MA
Descartes R 1911 Meditations on first philosophy. In: Haldane E S, Ross G R T (eds.) The Philosophical Works of Descartes. Cambridge University Press, Cambridge, UK, Vol. 1, pp. 131–200
Flavell J H, Green F L, Flavell E R 1995 Young Children’s Knowledge About Thinking. University of Chicago Press, Chicago
Gopnik A 1993 How we know our minds: The illusion of first-person knowledge of intentionality. Behavioral and Brain Sciences 16: 1–14
Ludlow P, Martin N (eds.) 1998 Externalism and Self-knowledge. Cambridge University Press, Cambridge, UK
McDowell J 1991 Intentionality and interiority in Wittgenstein. In: Puhl K (ed.) Meaning Scepticism. W. de Gruyter, Berlin, pp. 168–9
McGeer V 1996 Is ‘self-knowledge’ an empirical problem? Renegotiating the space of philosophical explanation. Journal of Philosophy 93: 483–515
Nisbett R E, Wilson T D 1977 Telling more than we can know: Verbal reports on mental processes. Psychological Review 84(3): 231–59
Povinelli D J, Eddy T J 1996 What young chimpanzees know about seeing. Monographs of the Society for Research in Child Development 61(3): 1–190
Ryle G 1949 The Concept of Mind. Hutchinson’s University Library, London
Shoemaker S 1996 The First Person Perspective and Other Essays. Cambridge University Press, Cambridge, UK
Wright C 1991 Wittgenstein’s later philosophy of mind: Sensation, privacy and intention. In: Puhl K (ed.) Meaning Scepticism. W. de Gruyter, Berlin, pp. 126–47
Wright C, Smith B C, Macdonald C (eds.) 1998 Knowing Our Own Minds. Clarendon Press, Oxford, UK
V. McGeer
Self-monitoring, Psychology of
According to the theory of self-monitoring, people differ in the extent to which they monitor (i.e., observe and control) their expressive behavior and self-presentation (Snyder 1974, 1987). Individuals high in self-monitoring are thought to regulate their expressive self-presentation for the sake of public appearances, and thus be highly responsive to social and interpersonal cues to situationally appropriate performances. Individuals low in self-monitoring are thought to lack either the ability or the motivation to regulate their expressive self-presentations for such purposes. Their expressive behaviors are thought instead to reflect their own inner states and dispositions, including their attitudes, emotions, self-conceptions, and traits of personality. Research on self-monitoring typically has employed multi-item, self-report measures to identify people high and low in self-monitoring. The two most frequently employed measuring instruments are the 25 true–false items of the original Self-monitoring Scale (Snyder 1974) and an 18-item refinement of this measure (Gangestad and Snyder 1985; see also Lennox and Wolfe 1984). Empirical investigations of testable hypotheses spawned by self-monitoring theory have accumulated into a fairly sizable published literature (for a review of the literature, see Gangestad and Snyder 2000).
1. Major Themes of Self-monitoring Theory and Research
Soon after its inception, and partially in response to critical theoretical issues of the times, self-monitoring was offered as a partial resolution of the ‘traits vs. situations’ and ‘attitudes and behaviors’ controversies in personality and social psychology. The propositions of self-monitoring theory suggested that the behavior of low self-monitors ought to be predicted readily from measures of their attitudes, traits, and dispositions whereas that of high self-monitors ought to be best predicted from knowledge of features of the situations in which they operate. Self-monitoring promised a ‘moderator variable’ resolution to debates concerning the relative roles of person and situation in determining behavior. These issues set the agenda for the first generation of research and theorizing on self-monitoring, designed primarily to document the relatively ‘situational’ orientation of high self-monitors and the comparatively ‘dispositional’ orientation of low self-monitors (for a review, see Snyder 1987). In a second generation of research and theorizing, investigations moved beyond issues of dispositional and situational determination of behavior to examinations of the linkages between self-monitoring and
interpersonal orientations. Perhaps the most prominent of these programs concerns the links between expressive control and interpersonal orientations, as revealed in friendships, romantic relationships, and sexual involvements (e.g., Snyder et al. 1985). Other such programs of research concern advertising, persuasion, and consumer behavior (e.g., Snyder and DeBono 1985), personnel selection (e.g., Snyder et al. 1988), organizational behavior (Caldwell and O’Reilly 1982, Kilduff 1992), socialization and developmental processes (e.g., Eisenberg et al. 1991, Graziano and Waschull 1995), and cross-cultural studies (e.g., Gudykunst 1985). Central themes in these programs of research have been that high self-monitors live in worlds of public appearances created by strategic use of impression management, and that low self-monitors live in worlds revolving around the private realities of their personal identities and the coherent expression of these identities across diverse life domains. Consistent with these themes, research on interpersonal orientations has revealed that high, relative to low, self-monitors choose as activity partners friends who will facilitate the construction of their own situationally appropriate images and appearances (e.g., Snyder et al. 1983). Perhaps because of their concern with images and appearances, high self-monitors have romantic relationships characterized by less intimacy than those of low self-monitors. Also consistent with these themes, explorations of consumer attitudes and behavior have revealed that high self-monitors value consumer products for their strategic value in cultivating social images and public appearances, reacting positively to advertising appeals that associate products with status and prestige; by contrast, low self-monitors judge consumer products in terms of the quality of the products stripped of their image-creating and status-enhancing veneer, choosing products that they can trust to perform their intended functions well (e.g., DeBono and Packer 1991). These same orientations manifest themselves in the workplace as well, with high self-monitors preferring positions that call for the exercise of their self-presentational skills; thus, for example, high self-monitors perform particularly well in occupations that call for flexibility and adaptiveness in dealings with diverse constituencies (e.g., Caldwell and O’Reilly 1982) whereas low self-monitors appear to function best in dealing with relatively homogeneous work groups. It should be recognized that, although these programs of research, for the most part, have not grounded their hypotheses or interpretations in self-monitoring’s traditionally fertile ground—issues concerning the dispositional vs. situational control of behavior—they do nevertheless reflect the spirit of the self-monitoring construct. That is, their guiding themes represent clear expressions of self-monitoring theory’s defining concerns with the worlds of public appearances and social images, and the processes by
which appearances and images are constructed and sustained. However, it should also be recognized that these lines of research go beyond showing that individual differences in concern for cultivating public appearances affect self-presentational behaviors. These programs of research have demonstrated that these concerns, and their manifestations in expressive control, permeate the very fabric of individuals’ lives, affecting their friendship worlds, their romantic lives, their interactions with the consumer marketplace, and their work worlds.
2. The Nature of Self-monitoring
Despite their generativity, the self-monitoring construct and its measure have been the subject of considerable controversy over how self-monitoring ought to be interpreted and measured. The roots of this controversy are factor analyses that clearly reveal that the items of the Self-monitoring Scale are multifactorial, with the emergence of three factors being the most familiar product of these factor analyses (Briggs et al. 1980). These factor analyses, and attempts to interpret them, have stimulated a critically important question: Is self-monitoring truly a unitary phenomenon? Although there is widespread agreement about the multifactorial nature of the items of the Self-monitoring Scale, there exist diverging viewpoints on the interpretation of this state of affairs. One interpretation is that some criterion variables represented in the literature might relate to one factor, other criterion variables to a second independent factor, and yet others to still a third factor—an interpretation which holds that self-monitoring is not a unitary phenomenon (e.g., Briggs and Cheek 1986). Without disputing the multifactorial nature of the self-monitoring items, it is nevertheless possible to construe self-monitoring as a unitary psychological construct. Taxonomic analyses have revealed that the self-monitoring subscales all tap, to varying degrees, a common latent variable that may reflect two discrete or quasi-discrete self-monitoring classes (Gangestad and Snyder 1985). In addition, the Self-monitoring Scale itself taps a large common factor accounting for variance in its items and correlating, to varying degrees, with its subscales; this general factor approximates the Self-monitoring Scale’s first unrotated factor (Snyder and Gangestad 1986). Thus, the Self-monitoring Scale may ‘work’ to predict diverse phenomena of individual and social functioning because it taps this general factor; this interpretation is congruent with self-monitoring as a unitary, conceptually meaningful psychological construct. Although much of the debate about the nature of the self-monitoring construct has focused on
contrasting interpretations of the internal structure of the Self-monitoring Scale, it is possible to consult another source of evidence with which to address the major issues of the self-monitoring controversy—the literature on the Self-monitoring Scale’s relations with criterion variables external to the scale itself (i.e., behavioral, behavioroid, and performance measures of phenomena relevant to self-monitoring theorizing). Based on a quantitative review of the literature on the Self-monitoring Scale’s relations with behavioral and behavioroid external criterion variables, it appears that, with some important exceptions, a wide range of external criteria tap a dimension directly measured by the Self-monitoring Scale (Gangestad and Snyder 2000). Based on this quantitative appraisal of the self-monitoring literature, it is possible to offer some specifications of what self-monitoring is and what it is not, specifications that may guide the next generations of theory and research on self-monitoring. That is, it is possible to identify ‘exclusionary messages’ about features of self-monitoring theory that should not receive the attention heretofore accorded them (e.g., delimiting the scope of self-monitoring as a moderator variable such that claims about peer-self agreement ought no longer be made, although claims about behavioral variability may yet be made), and to identify ‘inclusionary messages’ about features that should define the evolving agenda for theory and research on self-monitoring (e.g., focusing on the links between self-monitoring and strategic motivational agendas associated with engaging in, or eschewing, impression management tactics that involve the construction of social appearances and the cultivation of images).
3. Conclusions
To some extent, the productivity and generativity of the self-monitoring construct may derive from the fact that it appears to capture one of the fundamental dichotomies of psychology—whether behavior is a product of forces that operate from outside of the individual (exemplified by the ‘situational’ orientation of the high self-monitor) or whether it is governed by influences that guide from within the individual (typified by the ‘dispositional’ orientation of the low self-monitor). In theory and research, self-monitoring has served as a focal point for issues in assessment, in the role of scale construction in theory building, and in examining fundamental questions about personality and social behavior, particularly those concerning how individuals incorporate inputs from their own stable and enduring dispositions and inputs from the situational contexts in which they operate into agendas for action that guide their functioning as individuals and as social beings.
See also: Impression Management, Psychology of; Personality and Conceptions of the Self; Self-conscious Emotions, Psychology of; Self-evaluative Process, Psychology of; Self-knowledge: Philosophical Aspects
Bibliography
Briggs S R, Cheek J M 1986 The role of factor analysis in the development and evaluation of personality scales. Journal of Personality 54: 106–48
Briggs S R, Cheek J M, Buss A H 1980 An analysis of the self-monitoring scale. Journal of Personality and Social Psychology 38: 679–86
Caldwell D F, O’Reilly C A 1982 Boundary spanning and individual performance: The impact of self-monitoring. Journal of Applied Psychology 67: 124–27
DeBono K G, Packer M 1991 The effects of advertising appeal on perceptions of product quality. Personality and Social Psychology Bulletin 17: 194–200
Eisenberg N, Fabes R, Schaller M, Carlo G, Miller P 1991 The relation of parental characteristics and practices to children’s vicarious emotional responding. Child Development 62: 1393–1408
Gangestad S, Snyder M 1985 ‘To carve nature at its joints’: On the existence of discrete classes in personality. Psychological Review 92: 317–49
Gangestad S, Snyder M 2000 Self-monitoring: Appraisal and reappraisal. Psychological Bulletin 126: 530–55
Graziano W G, Waschull S B 1995 Social development and self-monitoring. Review of Personality and Social Psychology 15: 233–60
Gudykunst W B 1985 The influence of cultural similarity, type of relationship, and self-monitoring on uncertainty reduction processes. Communication Monographs 52: 203–17
Kilduff M 1992 The friendship network as a decision-making resource: Dispositional moderators of social influences on organization choice. Journal of Personality and Social Psychology 62: 168–80
Lennox R, Wolfe R 1984 Revision of the self-monitoring scale. Journal of Personality and Social Psychology 46: 1349–64
Snyder M 1974 Self-monitoring of expressive behavior. Journal of Personality and Social Psychology 30: 526–37
Snyder M 1987 Public Appearances, Private Realities: The Psychology of Self-monitoring. W. H. Freeman, New York
Snyder M, Berscheid E, Glick P 1985 Focusing on the exterior and the interior: Two investigations of the initiation of personal relationships. Journal of Personality and Social Psychology 48: 1427–39
Snyder M, Berscheid E, Matwychuk A 1988 Orientations toward personnel selection: Differential reliance on appearance and personality. Journal of Personality and Social Psychology 54: 972–9
Snyder M, DeBono K G 1985 Appeals to image and claims about quality: Understanding the psychology of advertising. Journal of Personality and Social Psychology 49: 586–97
Snyder M, Gangestad S 1986 On the nature of self-monitoring: Matters of assessment, matters of validity. Journal of Personality and Social Psychology 51: 125–39
Snyder M, Gangestad S, Simpson J A 1983 Choosing friends as activity partners: The role of self-monitoring. Journal of Personality and Social Psychology 45: 1061–72
M. Snyder
Self-organizing Dynamical Systems
Interactions, nonlinearity, emergence and context, though omnipresent in the social and behavioral sciences, have proved remarkably resistant to understanding. New light may be shed on these problems by virtue of the introduction and development of theoretical concepts and methods of self-organization and the mathematical tools of (nonlinear) dynamical systems (Haken 1996, Kelso 1995). Self-organized dynamics promises both a language for, and a strategy toward, understanding human behavior on multiple levels of description. A key problem on any given level of description is to identify the essential dynamical variables characterizing the formation and change of behavioral patterns so that the pattern dynamics, the rules governing behavior, may be found and their predictions pursued. Dynamics offers not only a concise mathematical description of different kinds and classes of behavior; dynamical models also provide new measures and predict new phenomena often not observed before in studies of individual human and social behavior. And they may help explain effects previously attributed to more static, nondynamical mechanisms. Therein lies the promise of dynamics for the social and behavioral sciences at large. Dynamics offers a new way to think about, perhaps even solve, old problems. Here, following a brief historical introduction, the main ideas of dynamical systems are described in a nontechnical fashion. The connection between notions of self-organization and nonlinear dynamical systems is then addressed as a foundation for understanding behavioral pattern formation, flexibility, and change. A flavor of the approach, which stresses an intimate relationship between theory and experiment, is conveyed in the context of a few examples from the behavioral and social sciences. Finally, some extensions of the approach are discussed briefly, including a way to include meaningful information into the self-organizing dynamics.
1. A Short History of Dynamical Systems
For the eighteenth-century Scottish philosopher David Hume, all the sciences bore a relation to the human mind. In his Treatise of Human Nature (1738), Hume first divided the mind into its contents: ideas and
impressions. Then he added dynamics, noting the impossibility of simple ideas forming more complex ones without some bond of union between them. Hume’s three dynamical laws for the association of ideas—resemblance, contiguity, and cause and effect—were thought to be responsible for controlling all mental operations. A kind of attraction, Hume thought, existed in the mental world: the motion of the mind was conceived as analogous to the motion of a body, as described earlier by Newton. Mental ‘stuff’ was governed (somehow) by dynamics. All this has a strangely contemporary ring to it. Dynamics is not only the language of the natural sciences, but has permeated the social, behavioral, and neurosciences as well (Arbib et al. 1998, Kelso 1995, Vallacher and Nowak 1997). Cognitive science, whose theory and paradigms are historically embedded in the language of the digital computer, may be the most recent of the human sciences to fall under the spell of dynamics (Port and van Gelder 1995, Thelen and Smith 1994). Which factors led to the primacy of dynamics as the language of science, whether natural or human? First was the insight of the French mathematician Henri Poincaré to put dynamics into geometric form. Poincaré introduced an array of powerful methods for visualizing how things behave, somewhat independent of the things themselves. Second, following Poincaré’s lead, were developments in understanding nonlinear dynamical systems from a mathematical point of view. These are sets of differential equations that can exhibit multiple solutions for the same parameters and are capable of highly irregular, even unpredictable behavior. Third was the ability to visualize how these equations behave by integrating them numerically on a digital computer. Parameters can be varied and parameter spaces explored without doing a ‘real’ experiment or collecting data of any kind. In fact, computer simulation and visualization is often the only possible method of exploring complex behavior. Finally, there were quite specific examples of cooperative (or ‘collective’) behavior in physics (e.g., fluids, lasers), chemistry (e.g., chemical reactions), biology (e.g., morphogenesis), and later, in the behavioral sciences. The behavioral experiments (Kelso 1984) and consequent theoretical modeling (Haken et al. 1985) showed for the first time that dynamic patterns of human behavior, how behavior forms, stabilizes, and changes, may be described quite precisely in the mathematical language of nonlinear dynamical systems. Not only were new effects predicted and observed; it was also possible to derive these emergent patterns and the pattern dynamics by nonlinearly coupling the component parts. Since these early empirical and theoretical developments, ideas of self-organization and dynamics have infiltrated the social, behavioral, economic, and cognitive sciences, not to speak of literature and the arts.
2. Dynamical Systems: Some Terminology
What exactly is a dynamical system? It is not possible here to do justice to the mathematics of dynamical systems and its numerous technical applications, for example, in time series analysis (e.g., Strogatz 1994). For present purposes, only the basics are considered. Fundamentally, a dynamical system pertains to anything—such as the behavior of a human being or a social group—that evolves over time. Mathematically, if X₁, X₂, …, Xₙ are the variables characterizing the system’s behavior, a dynamical system is a system of equations stipulating the temporal evolution of X, the vector of all permissible X-values. Therein lies the rub: in the social and behavioral sciences one has to find these Xs and identify their dynamics on a chosen level of description. If X is a continuous function of time, then the dynamics of X is typically described by a set of first-order ordinary differential equations (ODEs). The order of the equation refers to the highest derivative that appears in the equation. Thus, Ẋ = F(X) is a first-order equation, where the dot above the X denotes the first derivative with respect to time and F(X) gives the vector field. ODEs are of greatest interest when F is nonlinear because, depending on the particular form of F, a wide variety of behaviors can arise. Another important class of dynamical system is difference equations or maps. Maps are often useful when the behavior of a system appears to be organized around discrete temporal events. Maps and ODEs provide complementary ways to model and study nonlinear dynamical behavior, but we shall not pursue maps further here (see, however, van Geert 1995 as an example of the approach). A vectorfield is defined at each and every point of the state or phase space of the system, defined by the values of the variables X₁, X₂, …, Xₙ. As these states evolve in time they can be represented graphically as a trajectory. The state space is filled with trajectories. At any point in a given trajectory, a velocity vector can be determined by differentiation. These velocity vectors prescribe a velocity vectorfield, which is assumed to represent the flow or behavior of the dynamical system (Abraham and Shaw 1982). This flow often has characteristic tendencies, particularly in dissipative systems, named thus because their state space volume decreases or dissipates over time toward attractors. In other words, trajectories converge over time (technically, as time goes to infinity) to subsets of the phase space (a given attractor’s basin of attraction). One may even think of an attractor as a representation of the goal of the behaving system (Saltzman and Kelso 1987). The term transient defines the segment of a trajectory starting from some initial condition in the basin of attraction until it settles into an attractor. Attractors can be fixed points, in which all initial conditions converge to some stable rest state.
Attractors can be periodic, exhibiting preferred rhythms or orbits on which the system settles regardless of where it starts. Or, there can be so-called strange attractors; strange because they exhibit deterministic chaos, a type of irregular behavior resembling random noise, yet often containing pockets of quite ordered behavior. The presence of chaos in physical systems is ubiquitous, and there is some evidence suggesting that it may also play an important role in certain biological systems, including the human brain and its disorders (Kelso and Fuchs 1995). Chaos means sensitive dependence on initial conditions: nudging a chaotic system by just a hair can change its future behavior dramatically. In the vernacular, human behavior certainly seems ‘chaotic’ sometimes. However, although an area of active interest in many research fields, it remains unclear whether chaos, and its close relative, fractals (Mandelbrot 1982), provide any essential insights in the social and behavioral sciences. What may be rather more important to appreciate is that parameters, so-called control parameters, can move a system through a vast array of different kinds of dynamic behavior, of which chaos is only one. Thus, when a parameter changes smoothly, the attractor in general also changes smoothly. In other words, sizeable changes in the input have little or no effect on the resulting output. However, when the control parameter passes through a critical point or threshold in an intrinsically nonlinear system an abrupt change in the attractor can occur. This sensitive dependence on parameters is called a bifurcation in mathematics, or a nonequilibrium phase transition in physical theories of pattern formation (Haken 1977).
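These notions are easy to make concrete numerically. The following minimal sketch (in Python; an illustration added here, not part of the original article) Euler-integrates the textbook one-dimensional ODE ẋ = ax − x³, in which a acts as a control parameter: for a < 0 every trajectory converges to the single fixed-point attractor x = 0, while for a > 0 that point has become a repellor and trajectories settle on the attractors at x = ±√a. Smooth variation of a thus produces an abrupt qualitative change, a bifurcation, at the critical value a = 0.

# Minimal sketch of fixed-point attractors and a bifurcation in a 1-D ODE.
# The example system x' = a*x - x**3 is a standard textbook choice and is
# not taken from the article; parameter values are illustrative only.

def simulate(a, x0, dt=0.01, steps=5000):
    """Euler-integrate x' = a*x - x**3 and return the settled state."""
    x = x0
    for _ in range(steps):
        x += dt * (a * x - x**3)
    return round(x, 3)

for a in (-1.0, 1.0):
    finals = [simulate(a, x0) for x0 in (-2.0, -0.1, 0.1, 2.0)]
    print(f"a = {a:+.1f}: trajectories settle at {finals}")

# For a = -1.0 all four initial conditions converge to the lone attractor
# x = 0; for a = +1.0 the origin is a repellor and trajectories settle at
# x = -1 or x = +1 (two coexisting basins of attraction). The qualitative
# change between the two regimes occurs at the critical value a = 0.

The same numerical strategy, integrating, discarding the transient, and inspecting where trajectories end up, carries over directly to the coordination dynamics discussed below.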
3. Concepts of Self-organization
Dynamical theory, by itself, does not give us any essential insights into how patterns of behavior may be created in complex systems that contain multiple (often different) elements interacting nonlinearly with each other and their environment. This is where the concept of self-organization is helpful (Haken 1977, Nicolis and Prigogine 1977). Self-organization refers to the spontaneous formation of pattern and pattern change in complex systems whose elements adapt to the very patterns of behavior they create. Think of birds flocking, fish schooling, bees swarming. There is no ‘self’ inside the system ordering the elements, dictating to them what to do and when to do it. Rather, the system, which includes the environment, literally organizes itself. Inevitably, when the elements form a coupled system with the world, coordinated patterns of behavior arise. Emergent pattern reflects the collective behavior of the elements. Collective behavior as a result of self-organization reduces the very large number of degrees of freedom into a much smaller set of relevant
dynamical variables called, appropriately enough, collective variables. The enormous compression of degrees of freedom near critical points can arise because events occur on different timescales: the faster individual elements in the system may become ‘enslaved’ to the slower, emergent pattern or collective variables, and lose their status as relevant behavioral quantities (Haken 1977). Alternatively, one may conceive of a hierarchy of timescales for various processes underlying human behavior. On a given level of the hierarchy are pattern variables subject to constraints (e.g., of the task) that act as boundary conditions on the pattern dynamics. At the next level down are component processes and events that typically operate on faster timescales. Notice that in this scheme (Kelso 1995) the key is to choose a level of description and understand the relation between adjacent levels, not reduce to some ‘fundamental’ lower level. So what causes behavioral patterns to form? And what causes pattern change? It is here that the connection between processes of self-organization and dynamical systems theory becomes quite explicit. In complex systems that are open to exchanges of information with their environment, naturally occurring environmental conditions or intrinsic, endogenous factors may take the form of control parameters in a nonlinear dynamical system. When the control parameter crosses a critical value, instability occurs, leading to the formation of new (or different) patterns. Dynamic instability is the generic mechanism underlying spontaneous self-organized pattern formation and change in all systems coupled to their internal or external environment. The reason is that near instability the individual elements, in order to adjust to current conditions (control parameter values), must order themselves in a new or different way. Fluctuations are always present, constantly testing whether a given behavioral pattern is stable. Fluctuations are not just noise; rather, they allow the system to discover new, more adaptive behavioral patterns. The patterns that emerge and change near instabilities have a rich and highly nonlinear pattern, or coordination, dynamics. Herein lies the basis of the hypothesis that human beings and the basic forms of behavior they produce may be understood as self-organizing dynamical systems (Kelso 1995). Humans—complex systems consisting of multiple, interacting elements—produce behavioral patterns that are captured by collective variables. The resulting dynamical rules are essentially nonlinear and thus capable of producing a rich repertoire of behaviors. By this hypothesis, all human behavior—even human cognitive development (Magnusson 1995, Thelen and Smith 1994) and learning (Zanone and Kelso 1992), which occur on very different time scales—arises because complex material systems, through the process of self-organization, create a dynamic, pattern-forming system that is capable of both behavioral
simplicity (such as fixed point behavior) and enormous, even creative, behavioral complexity (such as chaotic behavior). No new principles, it seems, need be introduced (but see Sect. 5).
4. Finding Dynamical Laws in Behavioral and Social Systems
In the social and behavioral sciences the key collective variables characterizing the system’s state space are seldom known a priori and have to be identified. Science always needs special entry points, places where irrelevant details may be pruned away while retaining the essential aspects of the behavior one is trying to understand. Unique to the perspective of self-organized dynamical systems is its focus on qualitative change, places where behavior bifurcates. The reason is that qualitative change affords a clear distinction between one pattern and another, thereby enabling one to identify the key pattern variables that define behavioral states. Likewise, any parameter that induces qualitative behavioral change qualifies as a control parameter. The payoff from knowing collective pattern variables and control parameters is high: they enable one to obtain the dynamical rules of behavior on a chosen level of description. By adopting the same strategy at the next level down, the individual component dynamics may be studied and identified. It is the interaction between these that creates the patterns at the level above, thereby building a bridge across levels of description.
5. The Laws that Bind Us
It has long been recognized that social relationships are an emergent product of the process of interaction. Commonly studied relationships are the mother–infant interaction, the marriage bond, and the patient–therapist relation, all of which are extremely complicated and difficult to understand. A more direct approach may be to study simpler kinds of social coordination, with the aim of determining whether such emergent behavior is self-organized, and if so what its dynamics are. As an illustrative example, consider two people coordinating their movements. In this particular task each individual is instructed to oscillate a limb (the lower leg in this case) in the same or opposite direction to that of the other (Schmidt et al. 1990). Obviously the two people must watch each other in order to do the task. Then, either by an instruction from the experimenter or by following a metronome whose rate systematically increases, the social dyad also must speed up their movements. When moving their legs up and down in the same direction, the two members of the dyad can remain synchronized throughout a broad range of speeds. However, when moving their legs in the opposite direction (one person’s leg extending at the knee while the other’s is flexing), such is not the case. Instead, at certain critical speeds participants spontaneously change their behavior so that the legs now move in the same direction. How might these social coordination phenomena be explained? The pattern variable that changes qualitatively at the transition is the relative phase, φ. When the two people move in the same direction, they are in-phase with each other, φ = 0. When they move in different directions, their behavior is antiphase (φ = ±π radians or ±180 degrees). The phase relation is a good candidate for a collective variable because it clearly captures the spatiotemporal ordering between the two interacting components. Moreover, φ changes far more slowly than the variables that might describe the individual components (position, velocity, acceleration, electromyographic activity of contracting muscles, etc.). Importantly, φ changes abruptly at the transition.
Figure 1 The potential V(φ) of Eqn. (2) (with ∆ω = 0) and Eqn. (4) (with ∆ω ≠ 0). Black balls symbolize stable coordinated behaviors and white balls correspond to unstable behavioral states (see text for details)
The simplest dynamics that captures all the observed facts is
φ̇ = −a sin φ − 2b sin 2φ
(1)
where φ is the relative phase between the movements of the two individuals, φ̇ is the derivative of φ with respect to time, and the ratio b/a is a control parameter corresponding to the movement rate in the experiment. An equivalent formulation of Eqn. (1) is φ̇ = −∂V(φ)/∂φ with
V(φ) = −a cos φ − b cos 2φ
(2)
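The behavior of Eqns. (1) and (2) can be explored with a few lines of code. Linearizing Eqn. (1) around φ = π shows that the antiphase state is stable only while b/a > 1/4, so integrations started near antiphase should settle at π above that critical ratio and switch to the in-phase state at 0 below it, while integrations started near in-phase should settle at 0 throughout. The sketch below (Python; an added illustration, not part of the original article, with illustrative parameter values) makes both outcomes visible; the coexistence of the two patterns at large b/a is the bistability, and the one-way character of the switch is the hysteresis, described in what follows.

import math

# Sketch of the relative-phase dynamics of Eqn. (1),
# phi' = -a*sin(phi) - 2*b*sin(2*phi), integrated by the Euler method.
# An illustration added here, not part of the original article; a is
# fixed at 1 so that the control parameter b/a is simply b.

def settle(ratio, phi, a=1.0, dt=0.01, steps=6000):
    """Return the phase (mod 2*pi) reached from initial phase phi at b/a = ratio."""
    b = ratio * a
    for _ in range(steps):
        phi += dt * (-a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi))
    return round(phi % (2.0 * math.pi), 2)

for ratio in (1.0, 0.5, 0.3, 0.2, 0.1):
    print(f"b/a = {ratio}: from antiphase -> {settle(ratio, math.pi - 0.1)}, "
          f"from in-phase -> {settle(ratio, 0.1)}")

# Above b/a = 0.25 both patterns persist (bistability): runs started near
# antiphase stay at phi = pi (about 3.14) and runs started near in-phase
# stay at phi = 0. Below 0.25 the antiphase state has become unstable and
# both runs end at 0. Because the in-phase state remains stable at every
# ratio, a system that has switched does not switch back when b/a is
# raised again: hysteresis.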
In the literature, this is called the HKB model of coordinated behavior, after Haken et al. (1985), who formulated it as an explanation of the coordination of limb movements within a single person (Kelso 1984). Figure 1 (top) allows one to develop an intuitive understanding of the behavior of Eqn. (2), as well as to connect to the key concepts of stability and instability in self-organizing dynamical systems introduced
earlier. The dynamics can be visualized as a particle moving in a potential function, V(φ). The minima of the potential are points of vanishing force, giving rise to stable solutions of the HKB dynamics. As long as the speed parameter (b/a) is slow, meaning the cycle period is long, Eqn. (2) has two stable fixed point attractors, collective states at φ = 0 and φ = ±π rad. Thus, two coordinated behavioral patterns coexist for exactly the same parameter values, the essentially nonlinear feature of bistability. Such bi- and in general multistability is common throughout the behavioral and social sciences. Ambiguous figures (old woman or young girl? Faces or vase?) are well-known examples (e.g., Kruse and Stadler 1995). As the ratio b/a is decreased, meaning that the cycle period gets shorter as the legs speed up, the formerly stable fixed point at φ = ±π rad. becomes unstable, and turns into a repellor. Any small perturbation will now kick the system into the basin of attraction of the stable fixed point (behavioral pattern) at φ = 0. Notice also that once there, the system’s behavior will stay in the in-phase attractor, even if the direction of the control parameter is reversed. This is called hysteresis, a basic form of memory in nonlinear dynamical systems. What about the individual components making up the social dyad? Research has established that these take the form of self-sustaining oscillators, archetypal of all time-dependent behavior whether regular or not. The particular functional form of the oscillator need not occupy the reader here. More important is the nature of the nonlinear coupling that produces the emergent coordination. The simplest, perhaps fundamental coupling that guarantees all the observed emergent properties of coordinated behavioral patterns—multistability, flexible switching among behavioral states, and primitive memory—is
K₁₂ = (Ẋ₁ − Ẋ₂){α + β(X₁ − X₂)²} (3)
where X₁ and X₂ are the individual components and α and β are coupling parameters. Notice that the ‘next level up,’ the level of behavioral patterns and the dynamical rule that governs them (Eqns. (1) and (2)), can be derived from the level below, the individual components and their nonlinear interaction. One may call this constructive reductionism: by focusing on adjacent levels, under the task constraint of interpersonal coordination, the individual parts can be put together to create the behavior of the whole. The basic self-organized dynamics, Eqns. (2) and (3), have been extended in numerous ways, only a few of which are mentioned here.
(a) Critical slowing down and enhancement of fluctuations. Introducing stochastic forces into Eqns. (1) and (2) allows key predictions to be tested and quantitatively evaluated. Critical slowing is easy to understand from Fig. 1 (top). As the minima at φ = ±π become shallower and shallower, the time it takes to adjust to a small perturbation becomes longer and longer. Thus,
the local relaxation time is predicted to increase as the instability is approached because the restoring force (given as the gradient in the potential) becomes smaller. Likewise, the variability of φ is predicted to increase due to the flattening of the potential near the transition point. Both predictions have been confirmed in a wide variety of experimental systems, including recordings of the human brain.
(b) Symmetry breaking. Notice that Eqns. (1) and (2) are symmetric: the dynamical system is 2π periodic and is identical under left–right reflection (φ is the same as −φ). This assumes that the individual components are identical, which is seldom, if ever, the case. Nature thrives on broken symmetry. To accommodate this fact, a term ∆ω is incorporated into the dynamics
φ̇ = ∆ω − a sin φ − 2b sin 2φ, and V(φ) = −∆ωφ − a cos φ − b cos 2φ
(4)
for the equation of motion and the potential respectively. Small values of ∆ω shift the attractive fixed points (Fig. 1 middle) in an adaptive manner. For larger values of ∆ω the attractors disappear entirely (Fig. 1 bottom) causing the relative phase to drift: no coordination between the components occurs. Note, however, that the dynamics still retain some curvature (Fig. 1 bottom right): even though there are no attractors there is still attraction to where the attractors used to be. The reason is that the difference (∆ω) between the individual components is sufficiently large that they do their own thing, while still retaining a tendency to cooperate. This is how global integration, in which the component parts are locked together, is reconciled with the tendency of the parts to function locally, as individual autonomous units (Kelso 1995). (c) Information: a new principle? Unlike the behavior of inanimate things, the self-organizing dynamics of human behavior is fundamentally informational, though not in the standard sense of data communicated across a channel (Shannon and Weaver 1949). Rather, collective variables are context-dependent and intrinsically meaningful. Context-dependence does not imply lack of reproducibility. Nor does it mean that every new context requires a new collective variable or order parameter. As noted above, within- and between-person coordinated behaviors are described by exactly the same self-organizing dynamics. One of the consequences of identifying the latter is that in order to modify or change the system’s behavior, any new information (say a task to be learned, an intention to change one’s behavior) must be expressed in terms of parameters acting on system-relevant collective variables. The benefit of identifying the latter is that one knows what to modify. Likewise, the collective variable dynamics—prior to the introduction of new information—influences how that information is used. Thus, information is not lying out there as mere data: information is meaningful
to the extent that it modifies, and is modified by, the collective variable dynamics.
(d) Generalization. The basic dynamics (Eqns. (1–4)) can readily be elaborated as a model of emergent coordinated behavior among many anatomically different components. Self-organized behavioral patterns such as singing in a group or making a ‘wave’ during a football game are common, yet unstudied examples. Recently, Neda et al. (2000) have examined a simpler group activity: applause in theater and opera audiences in Romania and Hungary. After an exceptional performance, initially thunderous incoherent clapping gives way to slower, synchronized clapping. Measurements indicate that the clapping period suddenly doubles at the onset of the synchronized phase, and slowly decreases as synchronization is lost. This pattern is a cultural phenomenon in many parts of Europe: a collective request for an encore. Increasing frequency (decreasing period) is a measure of the urgency of the message, and culminates in the transition back to noise when the performers reappear. These results are readily explained by a model of a group of globally coupled nonlinear oscillators (Kuramoto 1984)
dφₖ/dt = ωₖ + (K/N) Σⱼ₌₁ᴺ sin(φⱼ − φₖ)
(5)
in which a critical coupling parameter, Kc, determines the different modes of clapping behavior. Kc is a function of the dispersion (D) of the clapping frequencies:
Kc = √(2³/π) D
(6)
During fast clapping, synchronization is not possible due to the large dispersion of clapping frequencies. Slower, synchronized clapping at double the period arises when small dispersion appears. Period-doubling rhythmic applause tends not to occur in big open-air concerts where the informational coupling among the audience is small. K can also be societally imposed. In Eastern European communities during communist times, synchronization was seldom destroyed because enthusiasm was often low for the ‘great leader’s’ speech. For people in the West, the cultural information content of different clapping patterns may be quite different. Regardless, the mathematical descriptions for coordinated behavior—of social dyads and the psychology of the crowd—are remarkably similar.
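The same point can be checked numerically. The sketch below (Python; an added illustration, not part of the original article; N, K, D, and the mean clapping frequency are illustrative values) integrates Eqn. (5) in its standard mean-field form and measures synchrony with the conventional Kuramoto order parameter r = |Σⱼ exp(iφⱼ)|/N, which is near 0 for incoherent clapping and near 1 for unison. Holding the coupling K fixed while varying the dispersion D reproduces the observed pattern: the fast, high-dispersion mode stays incoherent, while the slow, low-dispersion mode synchronizes, in line with Eqn. (6).

import cmath
import math
import random

# Sketch of Eqn. (5) for N globally coupled oscillators; an illustration
# added here, not from the article. Natural frequencies are drawn from a
# Gaussian with dispersion D. The identity
#   (K/N) * sum_j sin(phi_j - phi_k) = K * r * sin(psi - phi_k),
# where r*exp(i*psi) is the mean of exp(i*phi_j), gives the mean-field form.

def clapping(N=200, K=1.0, D=0.5, dt=0.05, steps=4000, seed=1):
    """Euler-integrate Eqn. (5) and return the final order parameter r."""
    rng = random.Random(seed)
    omega = [rng.gauss(2.0 * math.pi, D) for _ in range(N)]    # natural frequencies
    phi = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N)]  # random initial phases
    for _ in range(steps):
        z = sum(cmath.exp(1j * p) for p in phi) / N
        r, psi = abs(z), cmath.phase(z)
        phi = [p + dt * (omega[k] + K * r * math.sin(psi - p))
               for k, p in enumerate(phi)]
    return abs(sum(cmath.exp(1j * p) for p in phi) / N)

for D in (1.5, 0.2):  # large vs. small dispersion of clapping frequencies
    print(f"D = {D}: order parameter r = {clapping(D=D):.2f}")

# With K fixed, the widely dispersed group remains incoherent (r stays
# near 0), while the narrowly dispersed group locks into synchrony
# (r approaches 1): the critical coupling Kc of Eqn. (6) grows with D,
# so only small dispersion puts the group on the synchronized side of
# the transition.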
6. Conclusion and Future Directions
The theoretical concepts and methods of self-organizing dynamics are likely to play an ever greater role in the social, behavioral, and cognitive sciences, especially as the interactions among disciplines continue to grow. Up to now, the use of nonlinear dynamics is still quite restricted, and largely metaphorical. One reason is that the tools are difficult to learn, and require a degree of mathematical sophistication. Their implementation in real systems is nontrivial, requiring a different approach to experimentation and observation. Another reason is that the dynamical perspective is often cast in opposition to more conventional theoretical approaches, instead of as an aid to understanding. The former tends to emphasize decentralization, collective decision-making and cooperative behavior among many interacting elements. The latter tends to focus on individual psychological processes such as intention, perception, attention, memory, and so forth. Yet there is increasing evidence that intending, perceiving, attending, deciding, emoting, and remembering have a dynamics as well. The language of dynamics serves to bridge individual and group processes. In each case, dynamics must be filled with content, with key variables and parameters obtained for the system under study. Every system is different, but what we learn about one may aid in understanding another. What may be most important are the principles and insights gained when human brains and human behavior are seen in the light of self-organizing dynamics.
See also: Chaos Theory; Computational Psycholinguistics; Dynamic Decision Making; Emergent Properties; Evolution: Self-organization Theory; Hume, David (1711–76); Neural Systems and Behavior: Dynamical Systems Approaches; Stochastic Dynamic Models (Choice, Response, and Time)
Bibliography
Abraham R H, Shaw C D 1982 Dynamics: The Geometry of Behavior. Ariel Press, Santa Cruz, CA
Arbib M A, Erdi P, Szentagothai J 1998 Neural Organization: Structure, Function and Dynamics. MIT Press, Cambridge, MA
Haken H 1977 Synergetics, an Introduction: Non-equilibrium Phase Transitions and Self-organization in Physics, Chemistry and Biology. Springer, Berlin
Haken H 1996 Principles of Brain Functioning. Springer, Berlin
Haken H, Kelso J A S, Bunz H 1985 A theoretical model of phase transitions in human hand movements. Biological Cybernetics 51: 347–56
Kelso J A S 1984 Phase transitions and critical behavior in human bimanual coordination. American Journal of Physiology: Regulatory, Integrative and Comparative 15: R1000–4
Kelso J A S 1995 Dynamic Patterns: The Self-organization of Brain and Behavior. MIT Press, Cambridge, MA
Kelso J A S, Fuchs A 1995 Self-organizing dynamics of the human brain: Critical instabilities and Sil’nikov chaos. Chaos 5(1): 64–9
Kuramoto Y 1984 Chemical Oscillations, Waves and Turbulence. Springer-Verlag, Berlin
Magnusson D 1995 Individual development: A holistic, integrated model. In: Moen P, Elder G H Jr, Lüscher K (eds.)
Examining Lives in Context. American Psychological Association, Washington, DC
Mandelbrot B 1982 The Fractal Geometry of Nature. Freeman, New York
Neda Z, Ravasz E, Brechet Y, Vicsek T, Barabasi A L 2000 The sound of many hands clapping. Nature 403: 849–50
Nicolis G, Prigogine I 1977 Self-organization in Nonequilibrium Systems. Wiley, New York
Port R F, van Gelder T 1995 Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge, MA
Saltzman E L, Kelso J A S 1987 Skilled actions: A task dynamic approach. Psychological Review 94: 84–106
Schmidt R C, Carello C, Turvey M T 1990 Phase transitions and critical fluctuations in the visual coordination of rhythmic movements between people. Journal of Experimental Psychology: Human Perception and Performance 16: 227–47
Shannon C E, Weaver W 1949 The Mathematical Theory of Communication. University of Illinois Press, Chicago
Strogatz S H 1994 Nonlinear Dynamics and Chaos. Addison-Wesley, Reading, MA
Thelen E, Smith L B 1994 A Dynamic Systems Approach to the Development of Cognition and Action. MIT Press, Cambridge, MA
Vallacher R R, Nowak A 1997 The emergence of dynamical social psychology. Psychological Inquiry 8: 73–99
Van Geert P 1995 Learning and development: A semantic and mathematical analysis. Human Development 38: 123–45
Zanone P G, Kelso J A S 1992 The evolution of behavioral attractors with learning: Nonequilibrium phase transitions. Journal of Experimental Psychology: Human Perception and Performance 18(2): 403–21
J. A. S. Kelso
Self: Philosophical Aspects
There are two related ways in which philosophical reflection on the self may usefully be brought to bear in social science. One concerns the traditional problem about personal identity. The other concerns the distinctive kinds of social relations into which only selves can enter. Sustained reflection on the second issue reveals striking analogies between some of the social relations into which only selves can enter and certain first personal relations that only a self can bear to itself. These analogies entail two possibilities that bear on the first issue concerning personal identity, namely, the possibility that there could be group selves who are composed of many human beings and the possibility that there could be multiple selves within a single human being. Section 1 outlines the classical problem of personal identity that was originally posed by Locke and continues to generate philosophical debate. Section 2 discusses some of the distinctive social relations into which only selves can enter. Section 3 draws several analogies between such social relations and first
personal relations. Section 4 shows how these analogies point to the possibilities of group and multiple selves. It closes with some cautionary remarks about how overlooking these two possibilities can lead to confusions about methodological individualism.
1. The Philosophical Problem about Personal Identity
To have a self is to be self-conscious in roughly the sense that Locke took to be the defining mark of the person. In his words, a person is ‘a thinking, intelligent being, that has reason and reflection, and can consider itself as itself, the same thinking thing, in different times and places’ (Locke 1979). In general, all of these capabilities—for thought, reason, reflection, and reflexive self-reference—tend to be lumped together by philosophers under the common heading of self-consciousness. Except in some generous accounts of animal intelligence, it is generally assumed that the only self-conscious beings are human beings. Locke famously disagreed. This was not because he believed that there are other species of animal besides the human species that actually qualify as persons (though he did discuss at some length a certain parrot that was reported to have remarkable conversational abilities). It was rather because he thought that the condition of personal identity should not be understood in biological terms at all; it should be understood, rather, in phenomenological terms. A person’s identity extends, as he put it, exactly as far as its consciousness extends. He saw no reason why the life of a person so construed, as a persisting center of consciousness, should necessarily coincide with the biological lifespan of a given human animal. In defense of this distinction between personal and animal identity he offered the following thought experiment: first, imagine that the consciousnesses of a prince and a cobbler are switched each into the other’s body and, then, consider who would be who thereafter. He thought it intuitively obvious that the identity of the prince is tied to his consciousness in such a way that he would remain the same person—would remain himself—even after having gained a new, cobbling body (similar remarks apply to the cobbler). Many objections have been raised against Locke’s argument and conclusion. It still remains a hotly disputed matter among philosophers whether he was right to distinguish the identity of the person from the identity of the human animal (Parfit 1984, Perry 1975, Rorty 1979). This philosophical dispute has not drawn much attention from social scientists, who have generally assumed, against Locke, that the life of each individual self or person coincides with the life of a particular human being. The anti-Lockean assumption is so pervasive in the social sciences that it serves as common ground even among those who take
opposing positions on the issue concerning whether all social facts are reducible to facts about individuals or whether there are irreducibly social facts (see Individualism versus Collectivism: Philosophical Aspects; Methodological Individualism: Philosophical Aspects). Both sides take for granted that the only individuals in question are human beings. Prima facie, it seems reasonable that social scientists should disregard Locke’s distinction between the self or person and the human being. For something that neither Locke nor neo-Lockeans have ever done is offer an actual instance of his distinction. All they have argued for is the possibility that a person’s consciousness could be transferred from one human body to another and, in every case, they have resorted to thought experiments in their attempts to show this. There is a significant difference between Locke’s original thought experiment about the prince and the subsequent neo-Lockean experiments. Whereas his simply stipulates that a transfer of consciousness has occurred, the latter typically describe some particular process by which the transfer of consciousness is supposed to be accomplished—as in, for example, brain transplantation or brain reprogramming. However, although this might make the neo-Lockean thought experiments seem more plausible, there is no realistic expectation that what they ask us to imagine will ever actually come to pass (Wilkes 1988). Because this is so, they have no particular bearing on the empirical concerns of social science. Nevertheless, it is important to be clear about whether the anti-Lockean assumption that informs social science is correct. For the way in which we conceive the boundaries that mark one individual self off from another will inevitably determine how we conceive the domain of social relations that is the common subject of all the social sciences. So we need to be absolutely clear about whether it is right to assume, against Locke, that the identity of the individual is at bottom a biological fact. As the next two sections will show, there are reasons to side with Locke. But these reasons differ from Locke’s own in several respects. Unlike his, they are not derived from thought experiments. Furthermore, they support a somewhat different version of his distinction between the identity of the self or person and the human being. Finally, they invite the expectation that this version of his distinction can be realized in fact and not just in imagination—which makes it relevant to social science after all.
2. The Sociality of Selves: Rational Relations
Starting roughly with Hegel, various philosophers have mounted arguments to the effect that self-consciousness is a social achievement that simply cannot be possessed by beings who stand outside of all social relations (Wittgenstein 1953, Davidson 1984).
The common conclusion of these arguments is overwhelmingly plausible. But even if the conclusion were shown to be false, it would remain true that many self-conscious beings are social and, furthermore, that their social relations reflect their self-consciousness in interesting ways. In order to gain a full appreciation of this fact, we must shift our attention away from the phenomenological dimension of self-consciousness that Locke emphasized in his analysis of personal identity, and toward the other dimension of self-consciousness that he emphasized in his definition of the person as a ‘being with reason and reflection.’ One need not cite Locke as an authority in order to see that reflective rationality is just as crucial to selfhood as its phenomenological dimension. For it obviously will not do to define the self just in phenomenological terms, as a center of consciousness, and this is so even if Locke was right to analyze the identity of the self or person in such terms. After all, some sort of center of consciousness is presupposed by all sentience, even in the case of animals that clearly do not have selves, such as wallabies or cockatoos. It also will not do to suppose that selfhood is in place whenever the sort of consciousness that goes together with sentience is accompanied by the mere capacity to refer reflexively to oneself. For, assuming that wallabies and cockatoos can represent things (which some philosophers deny (Davidson 1984)), it is very likely that some of their representations will function in ways that are characteristic of self-representation. Consider, for example, mental maps in which such animals represent their spatial relations to the various objects they perceive; such maps require a way to mark the distinction between their own bodies and other bodies, a distinction which is not inaptly characterized as the distinction between self and other (Evans 1982). Self-consciousness in the sense that goes together with selfhood involves a much more sophisticated sort of self-representation that incorporates a conception of oneself as a thinking thing and, along with it, the concept of thought itself. Armed with this concept, a self-conscious being can not only represent itself; it can self-ascribe particular thoughts and, in doing so, it can bring to bear on them the critical perspective of reflective rationality. Whenever selves in this full sense enter into social relations, they enter into relations of reciprocal recognition. Each is self-conscious; each conceives itself as an object of the other’s recognition; each conceives the other as so conceiving itself (i.e., as an object of the other’s recognition); each conceives both as having a common nature and, in so doing, each conceives both as so conceiving both (i.e., as having a common nature). Since this reciprocally recognized common nature is social as well as rational, it affords the possibility of the social exercise of rational capacities. Thus, corresponding to the capacity to ascribe thoughts to oneself, there is the capacity to ascribe
thoughts to others; corresponding to the capacity to critically evaluate one’s own thoughts there is the capacity to critically evaluate the thoughts of others; and together with all this comes another extraordinary social capacity, the capacity for specifically rational influence in which selves aim to move one another by appealing to the normative force of reasons. Such rational influence is attempted in virtually every kind of social engagement that takes place among selves, including: engaging in ordinary conversation and argument; offering bribes; lodging threats; bargaining; cooperating; collaborating. In all of these kinds of engagement reasons are offered up in one guise or another in the hope that their normative force will move someone else.
3. Analogies Between First Personal and Social Relations
It is because selves have a rational nature that analogies arise between their first personal relations to their own selves and their social relations to one another. Here are three examples.
3.1 Critical Evaluation
It is a platitude that the principles of rationality are both impersonal and universal. They lay down normative requirements that anyone must meet in order to be rational, and they constitute the standards by which everyone evaluates their own and everyone else’s rational performances. It follows that when we evaluate our own thoughts by these standards, we are really considering them as if they were anyone’s—which is to say, as if they were not our own. We sometimes describe this aspect of critical self-evaluation with the phrase ‘critical distance.’ There is a specific action by which we accomplish this distancing: we temporarily suspend our commitment to act on our thoughts until after we have determined whether keeping the commitment would be in accord with the normative requirements of rationality. If we determine that it would not be in accord with those requirements, then the temporary suspension of our commitment becomes permanent, and our critical distance on our thoughts gives way to literally disowning them. But before getting all the way to the point of disowning our thoughts we must first have gotten to a point where our relations to our thoughts are rather like our relations to the thoughts of others. For, whether we decide to disown the thoughts under critical scrutiny or to reclaim them, the normative standards on the basis of which we do so are necessarily conceived by us as impersonal and universal. In consequence, there is a straightforward sense in which we are doing the same thing when we apply the standards to our own
thoughts and when we apply them to the thoughts of others. There is this difference, though: whereas self-evaluation involves distancing ourselves from our own thoughts, evaluating the thoughts of others involves a movement in the opposite direction. We must project ourselves into their points of view so that we can appeal to their own sense of how the principles of rationality apply in their specific cases. This sort of criticism that involves projection is sometimes called ‘internal criticism,’ in contrast with the sort of ‘external criticism’ that proceeds without taking into account whether its target will either grasp it or be moved by it (Williams). This difference between internal criticism of others and self-criticism—the difference between projection and distancing—does nothing to undermine the analogies between them. On the contrary, for by these opposite movements two distinct selves can bring themselves to the very same place. That is, when one self projects itself into another’s point of view for the purposes of internal criticism, it can reach the very same place that the other gets to by distancing itself from its own thoughts for the purposes of self-criticism. Moreover, there is a sense in which what the two selves do from that place is not merely analogous, but the very same thing. Each of the two selves brings to bear the same critical perspective of rationality on the same particular self’s thoughts. The only difference that remains is that in the one case it is a social act directed at another, whereas in the other case it is a self-directed act.
3.2 Joint Activities
Just as internal criticism of another replicates what individual selves do when they critically evaluate themselves, so also joint activities carried out by many selves may replicate the activities of an individual self. For the purposes of such activities, distinct selves need to look for areas of overlap between (or among) their points of view, such as common ends, common beliefs about their common ends, and common beliefs about how they might act together so as to realize those ends. These areas of commonly recognized overlap can then serve as the basis from which the distinct selves deliberate and act together and, as such, the areas of overlap will serve as a common point of view which is shared by those selves. Whenever distinct selves do deliberate together from such a common point of view, they will be guided by the very same principles that ordinarily govern individual deliberation, such as consistency, closure (the principle that one should accept the implications of one’s attitudes), and transitivity of preferences. All such principles of rationality taken together define what it is for an individual to be fully or optimally rational. For this reason, they generally apply separately to each individual but not to groups of individuals. (Thus, different parties to a
dispute might be rational even though they disagree, but they cannot be rational if they believe contradictions.) The one exception is joint activities in which such groups of individuals deliberate together from a common point of view. To take a limited example, you and I might deliberate individually about a philosophical problem or we might do so together. If we do it individually, each of us will aim to be consistent in our reasoning because that is what the principles of rationality require of us individually. But these principles do not require us to do this together; which is to say, you and I may disagree about many things without compromising our ability to pursue our individual deliberations from our separate points of view in a completely rational manner. What would be compromised, however, is our ability to work on a philosophical problem together. We cannot do that without resolving at least some of our disagreements, namely, those that bear on the problem. This means that if we truly are committed to working on the problem together, then we should recognize a sort of conditional requirement of rationality, a requirement to achieve as much consistency between us as would be necessary to carry out our joint project. Although we needn’t achieve overall consistency together in order to do this—that is, we needn’t agree on every issue in every domain—nevertheless, we must resolve our disagreements about the relevant philosophical matters. When we resolve these disagreements we will be doing in degree the very same thing that individuals do when they respond to the rational requirement of consistency. The same holds for the other principles of rationality such as closure and transitivity of preferences. For, in general, whenever distinct individuals pursue ends together, they temporarily cease deliberating and acting separately from their individual points of view in order to act and deliberate together from their common point of view. They will also strive to achieve by social means within a group—albeit in diminished degree—the very sorts of rational relations and outcomes that are characteristic of the individual self (Korsgaard 1989).
3.3 Rationality over Time
Although there is some sense in which individual points of view are extended in time, the deliberations that proceed from such points of view always take place in time from the perspective of the present. The question immediately arises: why should an individual take its future attitudes into account when it deliberates from the perspective of the present? The question becomes poignant as soon as we concede that an individual’s attitudes may change significantly over time. Thus, consider Parfit’s Russian nobleman, who believes as a young man that he should give away all of his money to the poor, but who also believes that he
will become much more conservative when he is older and will then regret having given the money away. What should he do? Parfit answers as follows: if he recognizes any normative pressure toward an altruistic stance from which he is bound to take the desires of others into account, then he must extend that stance to his own future desires as well—regardless of the fact that he now disapproves of them (Parfit 1984). Other philosophers have also insisted on parity between the normative demands of prudence and altruism. Among moral philosophers, the parity is often taken in the positive way that Parfit does, as showing that we have reason to be both prudent and altruistic (Sidgwick 1981, Nagel 1970). But some philosophers have taken it in a negative way, as showing that we lack reason to be either. According to them, it follows from the fact that deliberation always proceeds from the perspective of the present that rationality does not require us to take our own future attitudes into account any more than it requires us to take the attitudes of others into account. If they are right, then the rational relations that hold within an individual over time are comparable to social relations within a group (Elster 1979, Levi 1985). However, anyone who is struck by this last suggestion—that an individual over time is, rationally speaking, analogous to a group—would do well to bear in mind what the first two analogies brought out, which is that the social exercise of rational capacities within a group can, in any case, replicate individual rationality.
4. Group Selves and Multiple Selves
It might seem implausible that the normative analogies between first personal and social relations could ever jeopardize the metaphysical distinction between them, which is grounded in whatever facts constitute the identity and distinctness of selves. According to Locke these facts are phenomenological; according to his opponents, the facts are biological. In both cases, the metaphysical facts on offer would clearly be adequate to ground the distinction between first personal and social relations, and this is so no matter how deep and pervasive the analogies between these two sorts of relations might be at the normative level. Yet there is a way of thinking about the self that calls this into question. It is the very way of thinking that predominated in Locke’s initial definition of the person as a being with ‘reason and reflection.’ When we emphasize this rational dimension of the self, we will find it natural to use a normative criterion in order to individuate selves. According to this normative criterion, we have one self wherever we find something that is subject to the normative requirements that define individual rationality. There is some disagreement among philosophers about exactly what these requirements are. But, even so, we can say this much
with confidence: it is a sufficient condition for being a rational individual who is subject to the requirements of rationality that the individual have some conception of what those requirements are and is in some degree responsive to them—in the sense of at least trying to meet them when it deliberates and accepting criticism when it fails to meet them (Bilgrami 1998). Of course, such criticism might issue from without as well as within, since a rational individual in this sense would certainly have the conceptual resources with which to comprehend efforts by others to wield rational influence over it. This ensures that there is a readily available public criterion by which such rational individuals can be identified for social purposes: there is one such individual wherever there is an interlocutor whom we can rationally engage. It must be admitted that, due to the analogies between first personal and social relations, the boundaries between such individuals will not always be clear. Yet this does not mean that the normative conception of the self is inadequate or unworkable. It only means that the conception has unfamiliar and, indeed, revisionist consequences. For, if we press the analogies to their logical limit, it follows that there can be group selves comprising many human beings and, also, multiple selves within a single human being (Rovane 1998). Take the group case first. We have seen that, for the purposes of joint activities, distinct selves must find areas of overlap between their individual points of view which can then serve as a common point of view from which they deliberate and act together. We have also seen that such joint deliberations replicate individual rationality in degree. Now, suppose that this were carried to the following extreme: a group of human beings resolved to engage only in joint activities; accordingly, they resolved to rank all of their preferences about such joint activities together and to pool all of their information; and, finally, they resolved always to deliberate jointly in the light of their jointly ranked preferences and pooled information. The overlap between their points of view would then be so complete that they no longer had separate points of view at all. They would still have separate bodies and separate centers of consciousness. But, despite their biological and phenomenological separation, they would nevertheless all share a single rational point of view from which they deliberated and acted as one. Consequently, they could also be rationally engaged in the ways that only individual selves can be engaged. There is good reason to regard such a group as a bona fide individual in its own right (Korsgaard 1989, Rovane 1998, Scruton 1989). Now take the multiple case. Recall that it might be rational for an individual to disregard its future attitudes when it deliberates about what to do in the present, in just the way that it might be rational (though possibly immoral) to disregard the attitudes of others. If the normative demands of rationality are thus confined to one’s own present attitudes, then
one’s future self will be no more bound by one’s present deliberations than one is presently bound to take one’s future attitudes into account. In other words, one’s future self will be free to disregard any long-term intentions or commitments that one is thinking of undertaking now. The point is not that there are no such things as long-term intentions or commitments. The point is that when one makes them one does not thereby bind one’s future self in any causal or metaphysical sense. It is always up to one’s future self to comply or not as it wishes, and one could not alter this aspect of one’s situation except by undermining one’s future agency altogether. Although this may seem alarming, it does not follow that individual selves do not endure over time. All that follows is that the unity of an individual self over time is constituted by the same sorts of facts that constitute the unity of a group self. Just as the latter requires shared commitments to joint endeavors, so too does the former, the only difference being that the latter involves sharing across bodies while the former involves sharing across time. This serves to bring out that the metaphysical facts in terms of which Locke and his opponents analyze the identity of the self over time are, in a sense, normatively impotent. The fact that there is a persisting center of consciousness, or a persisting animal body, does not suffice to bind the self together rationally over time. And, arguably (though this is more counterintuitive), such facts do not even suffice to bind the self together rationally in the present either. That is why we are able to understand the phenomenon of dissociative identity disorder, formerly labeled multiple personality disorder. Human beings who suffer from this disorder exhibit more than one point of view, each of which can be rationally engaged on its own. Since none of these points of view encompasses all of the human host’s attitudes, there is no possibility of rationally engaging the whole human being. At any given time one can engage only one part of it or another. Each of these parts functions more or less as a separate self, as is shown precisely by the susceptibility of each to rational engagement on its own. Although these multiple selves cannot be rationally engaged simultaneously, there is good reason to suppose that they nevertheless exist simultaneously. While one such self is talking, its companions are typically observing and thinking on their own (this generally comes out in subsequent conversation with them). Furthermore, these multiple selves who coexist simultaneously within the same body are sometimes ‘co-conscious’ as well—that is, they sometimes have direct phenomenological access to one another’s thoughts. But their consciousness of one another’s thoughts does not put them into the normative first person relation that selves bear to their own thoughts; for they do not take one another’s thoughts as a basis for their deliberations and actions. Thus neither their shared body nor their shared consciousness suffices to bind multiple personalities together in the sense that is
required by the normative conception of the self. On that conception, they qualify as individual selves in their own rights (Braude 1995, Rovane 1998, Wilkes 1988). Between these two extreme cases of group and multiple selves, the normative conception provides for a wide spectrum of cases, exhibiting differing degrees of rational unity within and among human beings. The conception says that whenever there is enough rational unity—though it is hard to say how much is enough—there is a rational being with its own point of view who can be rationally engaged as such and who, therefore, qualifies as an individual self in its own right.
5. Implications for Methodological Individualism
It is not the case that all of the facts about the Columbia Philosophy Department can be reduced to facts about its members atomistically described. Nor is it the case that all of the facts about that department can be explained by appealing to facts about its members atomistically described. Why? Because sometimes the members of the department stop reasoning from their own separate human-size points of view and begin to reason together from the point of view of the department’s distinctive projects, needs, opportunities, and so on. When the members of the department do this, they give over a portion of their human lives to the life of the department in a quite literal sense, so that the department itself can deliberate and act from a point of view that cannot be equated with any of their individual points of view. Outside the context of this article, this claim about the Columbia Philosophy Department might naturally be taken to contradict methodological individualism. This is because it is generally assumed that the only individuals there are or can be are human individuals. That is what leads many to interpret the facts about the Columbia Philosophy Department—insofar as they cannot be properly described or explained by appealing to facts about its individual human members atomistically described—as irreducibly social phenomena. However, this interpretation fails to take account of the way in which the department instantiates the rational properties that are characteristic of the individual self (vs. a social whole).
See also: Person and Self: Philosophical Aspects; Personality and Conceptions of the Self; Self: History of the Concept; Social Identity, Psychology of
Bibliography
Bilgrami A 1998 Self-knowledge and resentment. In: Wright C et al. (eds.) Knowing Our Own Minds. Oxford University Press, Oxford, UK
Braude S E 1995 First Person Plural: Multiple Personality Disorder and the Philosophy of Mind. Rowman & Littlefield, Lanham, MD
Davidson D 1984 Inquiries into Truth and Interpretation. Oxford University Press, New York
Elster J 1979 Ulysses and the Sirens: Studies in Rationality and Irrationality. Cambridge University Press, New York
Evans G 1982 Varieties of Reference. Oxford University Press, New York
Gilbert M 1989 On Social Facts. Routledge, London
Korsgaard C 1989 Personal identity and the unity of agency: A Kantian response to Parfit. Philosophy and Public Affairs 18(2)
Levi I 1985 Hard Choices. Cambridge University Press, New York
Locke J 1979 In: Nidditch P H (ed.) An Essay Concerning Human Understanding. Oxford University Press, Oxford, UK
Nagel T 1970 The Possibility of Altruism. Clarendon, Oxford, UK
Parfit D 1984 Reasons and Persons. Clarendon, Oxford, UK
Perry J (ed.) 1975 Personal Identity. University of California Press, Berkeley, CA
Rorty A O (ed.) 1979 The Identities of Persons. University of California Press, Berkeley, CA
Rovane C 1998 The Bounds of Agency: An Essay in Revisionary Metaphysics. Princeton University Press, Princeton, NJ
Scruton R 1989 Corporate persons. Proceedings of the Aristotelian Society 63
Sidgwick H 1981 The Methods of Ethics. Hackett, Indianapolis, IN
Wilkes K V 1988 Real People: Personal Identity Without Thought Experiments. Oxford University Press, New York
Wittgenstein L 1953 Philosophical Investigations. Macmillan, New York
C. Rovane
Self-regulated Learning
Self-regulated learning refers to how students become masters of their own learning processes. Neither a mental ability nor a performance skill, self-regulation is instead the self-directive process through which learners transform their mental abilities into task-related skills in diverse areas of functioning, such as academia, sport, music, and health. This article will define self-regulated learning and describe the intellectual context in which the construct emerged, changes in researchers’ emphasis over time as well as current emphases, methodological issues related to the construct, and directions for future research.
1. Defining Self-regulated Learning
Self-regulated learning involves metacognitive, motivational, and behavioral processes that are personally initiated to acquire knowledge and skill, such as goal setting, planning, learning strategies, self-reinforcement, self-recording, and self-instruction. A self-regulated learning perspective shifts the focus of educational analyses from students’ learning abilities and instructional environments as fixed entities to students’ self-initiated processes for improving their
methods and environments for learning. This approach views learning as an activity that students do for themselves in a proactive way, rather than as a covert event that happens to them reactively as a result of teaching experiences. Self-regulated learning theory and research is not limited to asocial forms of education, such as discovery learning, self-education through reading, studying, programmed instruction, or computer-assisted instruction, but can include social forms of learning such as modeling, guidance, and feedback from peers, coaches, and teachers. The key issue defining learning as self-regulated is not whether it is socially isolated but rather whether the learner displays personal initiative, perseverance, and adaptive skill in pursuing it. Most contemporary self-regulation theorists have avoided dualistic distinctions between internal and external control of learning and have envisioned self-regulation in broader, more interactive terms. Students can self-regulate their learning not only through covert cognitive means but also through overt behavioral means, such as selecting, modifying, or constructing advantageous personal environments or seeking social support. A learner’s sense of self is not limited to individualized forms of learning but includes self-coordinated collective forms of learning in which personal outcomes are achieved through the actions of others, such as family members, team-mates, or friends, or through use of physical environment resources, such as tools. Thus, covert self-regulatory processes are viewed as reciprocally interdependent with behavioral, social, and environmental self-regulatory processes. Self-regulation is defined as a variable process rather than as a personal attribute that is either present or absent. Even the most helpless learners attempt to control their functioning, but the quality and consistency (i.e., quantity) of their processes are low. Novice learners typically rely on naive forms of self-regulatory processes, such as setting nonspecific distal goals, using nonstrategic methods, inaccurate forms of self-monitoring, attributions to uncontrollable sources of causation, and defensive self-reactions. By contrast, expert learners display powerful forms of self-regulatory processes, especially during the initial phase of learning. Student efforts to self-regulate their learning have been analyzed in terms of three cyclical learning phases. Forethought phase processes anticipate efforts to learn and include self-motivational beliefs, such as self-efficacy, outcome expectations, and intrinsic interest, as well as task analysis skills, such as planning, goal setting, and strategy choice. Performance phase processes seek to optimize learning efforts and include use of time management, imagery, self-verbalization, and self-observation processes. Self-reflection phase processes follow efforts to learn and provide understanding of the personal implication of outcomes. They include self-judgment processes, such as self-evaluation and attributions, and self-reactive processes, such as self-satisfaction and adaptive/defensive
inferences. Because novice learners fail to use effective forethought processes proactively, such as proximal goal setting and powerful learning strategies, they must rely on reactive processes occurring after learning attempts that often have been unsuccessful. Such unfortunate experiences will trigger negative self-evaluations, self-dissatisfaction, and defensive self-reflections—all of which undermine the self-motivation necessary to continue cyclical efforts to learn. By understanding self-regulation in this cyclical, interactive way, qualitative as well as quantitative differences in process can be identified for intervention.
2. Intellectual Context for Self-regulated Learning Research
Interest in students’ self-regulated learning as a formal topic emerged during the 1970s and early 1980s out of general efforts to study human self-control. Promising investigations of children’s use of self-regulatory processes like goal setting, self-reinforcement, self-recording, and self-instruction in such areas of personal control as eating and task completion prompted educational researchers and reformers to consider their use by students during academic learning. Interest in self-regulation of learning was also stimulated by awareness of the limitations of prior efforts to improve achievement that stressed the role of mental ability, social environmental background of students, or qualitative standards of schools. Each of these reform movements viewed students as playing primarily a reactive rather than a proactive role in their own development. In contrast to prior reformers who focused on how educators should adapt instruction to students based on their mental ability, sociocultural background, or achievement of educational standards, self-regulation theorists focused on how students could proactively initiate or substantially supplement experiences designed to educate themselves. Interest in self-regulation of learning emerged from many theoretical sources during the 1970s and 1980s. For example, operant researchers adapted the principles and technology of B. F. Skinner for personal use, especially the use of environmental control, self-recording, and self-reinforcement. Their preference for the use of single-subject research paradigms and time-series data was especially useful for individuals seeking greater self-regulation of learning. During this same time period, phenomenological researchers shifted from monolithic, global views of how self-perceptions influenced learning to hierarchical, domain-specific views and began developing new self-concept tests to assess functioning in specific academic domains. The research of Hattie, Marsh, Shavelson, and others was especially influential in gaining new currency for the role of academic domain self-concepts in learning. During this era, social learning researchers, such as Bandura, Schunk, and Zimmerman,
shifted their emphasis from modeling to self-regulation and renamed the approach social cognitive. They identified a new task-specific motive for learning, self-efficacy belief, and linked it empirically to other social cognitive processes, such as goal setting, self-observation, self-judgment, and self-reaction. During this era, researchers such as Corno, Gollwitzer, Heckhausen, Kuhl, and others resurrected volitional notions of self-regulation to explain human efforts to pursue courses of learning in the face of competing events. In their view, self-regulatory control of action can be undermined by ruminating, extrinsic focusing, and vacillating—which interfere with the formation and implementation of an intention. Also during the 1970s and 1980s, suppressed writings of Vygotsky were published in English that explained more fully how children’s inner speech emerges from social interactions and serves as a source of self-control. Cognitive behaviorists, such as Meichenbaum, developed models of internalization training on the basis of Vygotsky’s description of how overt speech becomes self-directive. During this same era, cognitive constructivists shifted their interest from cognitive stages to metacognition and the use of learning strategies to explain self-regulated efforts to learn. The research and theory of Flavell played a major role in effecting this transition by describing self-regulation in terms of metacognitive knowledge, self-monitoring, and control of learning. Research on self-regulation was also influenced by the emergence of goal theories during the 1970s and 1980s. Locke and Latham showed that setting specific, proximal, challenging but attainable goals greatly influenced the effectiveness of learners’ efforts to learn. Theorists such as Ames, Dweck, Maehr, Midgley, and Nicholls identified individual differences in goal orientations that affected students’ efforts to learn on their own. These researchers found that learning or mastery goal orientations facilitated persistence and effort during self-directed efforts to learn, whereas performance or ego goal orientations curtailed motivation and achievement. During this same period, another perspective emerged that focused on the role of intrinsic interest in learning. Deci, Harackiewicz, Lepper, Ryan, Vallerand, and others demonstrated that perceptions of personal control, competence, or interest in a task were predictive of intrinsic efforts to learn on one’s own. This focus on intrinsic motivation was accompanied by a resurgence of research on the role of various forms of interest in self-directed learning by a host of scholars in Europe as well as elsewhere, such as Eccles, Hidi, Krapp, Renninger, Schiefele, Wigfield, and others. During these same decades, research on self-regulation of learning was influenced by cybernetic conceptions of how information is processed. Researchers such as Carver, Scheier, and Winne demonstrated the important role of executive processes, such as goal setting and self-monitoring, and feedback control loops in self-directed efforts to learn.
3. Changes in Emphasis Over Time
Before the 1980s, researchers focused on the impact of separate self-regulatory processes, such as goal setting, self-efficacy, self-instruction, volition, strategy learning, and self-management, with little consideration for their broader implications regarding student learning of academic subject matter. Interest in the latter topic began to coalesce in the mid-1980s with the publication of journal articles describing various types of self-regulated learning processes, good learning strategy users, self-efficacious learners, and metacognitive engagement, among other topics (Jan Simons and Beukhof 1987, Zimmerman 1986). By 1991, a variety of theories and some initial research on self-regulated learning and academic achievement had been published in special journal articles and edited textbooks (Maehr and Pintrich 1991, Zimmerman and Schunk 1989). These accounts of academic learning, which described motivational and self-reactive as well as metacognitive aspects of self-regulation, spurred considerable research. By the mid-1990s, a number of edited texts had been published chronicling the results of this first wave of descriptive research and experimental studies of self-regulated learning (Pintrich 1995, Schunk and Zimmerman 1994). The success of these empirical studies stimulated interest in systematic interventions to improve students’ self-regulated learning, and the results of these implementations emerged in journal articles and textbooks by the end of the 1990s (Schunk and Zimmerman 1998).
4. Emphases in Current Theory and Research
There is much current interest in understanding the self-motivational aspects of self-regulation as well as the metacognitive aspects (Heckhausen and Dweck 1998). Self-regulated learners are distinguished by their personal initiative and associated motivational characteristics, such as higher self-efficacy beliefs, learning goal orientations, favorable self-attributions, and intrinsic motivation, as well as by their strategic and self-monitoring competence. The issue of self-motivation is of practical as well as theoretical importance. On the practical side, researchers often confront apathy or helplessness when they seek to improve students’ use of self-regulatory processes, and they need viable methods for overcoming this lack of motivation. On the theoretical side, researchers need to understand how motivational beliefs interact with learning processes in a way that enhances students’ initiative and perseverance. A number of models have included motivational and learning features as interactive components, such as Pintrich’s self-schema model, Boekaerts’s three-layered model, Kuhl’s action/state control model, and Bandura, Schunk, and Zimmerman’s cyclical phase model. These models are designed to transcend conceptual barriers between
learning and motivational processes and to understand their reciprocal interaction. For example, causal attributions are expected to affect not only students’ persistence and emotional reactions but also adaptations in their methods of learning. Each of these models seeks to explain how learning can become self-motivating and can sustain effort over obstacles and time. A second issue of current interest is the acquisition of self-regulatory skills in deficient populations of learners. Study skill training programs have been instituted with diverse populations and age groups of students. These intervention programs have involved a variety of formats, such as separate strategy or study skill training classes, strategy training classes linked explicitly to subject matter classes, such as in mathematics or writing, or personal trainer services in classrooms and counseling centers. This research is uncovering a fascinating body of evidence regarding the metacognitive, motivational, and behavioral limitations of at-risk students. For example, naive learners often overestimate their knowledge and skill, which can lead to understudying, nonstrategic approaches, procrastination, and faulty attributions. Successful academic functioning requires accurate self-perceptions before appropriate goals are set and strategies are chosen. Many of these interventions are predicated on models that envision self-regulatory processes as cyclically interdependent, and a goal of these approaches is to explain how self-fulfilling cycles of learning can be initiated and sustained by students.
5. Methodological Issues
Initial efforts to measure self-regulated learning processes relied on inventories in which students are asked to rate their use of specific learning strategies, various types of academic beliefs and attitudes, typical methods of study, as well as their efforts to plan and manage their study time. Another method is the use of structured interviews that involve open-ended questions about problematic learning contexts, such as writing a theme, completing math assignments, and motivating oneself to study under difficult circumstances. The latter form of assessment requires students to create their own answers, and experimenters to train coders to recognize and classify various qualitative forms of self-regulatory strategies. Although both of these approaches have reported substantial correlations with measures of academic success, they are limited by their retrospective or prospective nature—their focus on the usual or preferred methods of learning. As such, they depend on recall or anticipatory knowledge rather than on actual functioning under taxing circumstances. To avoid these limitations, there have been efforts to study self-regulation on-line at various key points during learning episodes using speak-aloud protocols, experimenter questioning, and
various performance and outcome measures. Experimental studies employing the latter measures offer the best opportunity to test the causal linkage among various self-regulatory processes, but even this approach has potential shortcomings, such as inadvertent cuing or interference with self-regulatory processes. Two common research design issues have emerged in academic interventions with at-risk populations of students: the lack of a suitable control group, and the reactive effects of self-regulatory assessments. Educators have asked self-regulation researchers to provide assistance to students who are often in jeopardy of expulsion, and it is unethical to withhold treatment from some of these students for experimental purposes. One solution is to use intensive within-subject, time-series designs in which treatments are introduced in sequential phases. However, as students begin to collect and graph data on themselves, they become more self-observant and self-reflective—which can produce unstable baselines. This, in turn, can confound causal inferences about the effectiveness of other self-regulatory components of the intervention.
6. Future Directions for Research and Theory
6.1 Role of Technology
The computer has been recommended as an ideal instrument to study and enhance students’ self-regulation. Program menus can be faded when they are no longer needed, and performance processes and outcomes can be logged in either a hidden or overt fashion. Computers provide the ultimate feedback to the experimenter or the learner because results can be analyzed and graphed in countless ways to uncover underlying deficiencies. Winne is developing a specially designed computer learning environment intended to study and facilitate self-regulation by the student.
6.2 Developmental Research
Relatively little attention has been devoted to forms of self-regulation that can be performed by young children. Very young children have difficulty observing and judging their own functioning, are not particularly strategic in their approach to learning, and tend to reason intuitively from beliefs rather than evidence. Their self-judgments of competence do not match their teachers’ judgments until approximately the fifth grade. However, there is reason to expect that simple forms of self-regulatory processes begin to emerge during the early years in elementary school. For example, there is evidence that children can make self-comparisons with their earlier performance at the outset of elementary school. These issues are connected
to the key underlying issue of how self-regulation of learning develops in naturalistic as well as designed instructional contexts.
6.3 Out-of-school Influences
There is increasing research showing that nonacademic factors such as peer groups, families, and part-time employment strongly affect school achievement. Schunk and Zimmerman have discussed the forms of social influence that these social groups can have on students’ self-regulatory development. Brody and colleagues have found that parental monitoring of their children’s activities and standard setting regarding their children’s performance were very predictive of the children’s academic as well as behavioral self-regulation, which in turn was predictive of their cognitive and social development. Martinez-Pons has recently reported that parental modeling and support for their children’s self-regulation was predictive of the youngsters’ success in school. More attention is needed to the psychosocial origins of self-regulatory competence in academic learning.
6.4 Role of Teachers
The recent focus on academic intervention research has uncovered evidence that teachers often conduct classrooms in which it is difficult for students to self-regulate effectively. For example, teachers who fail to set specific instructional goals, who are ambiguous or inconsistent about their criteria for judging classroom performance, and who give ambiguous feedback about schoolwork make it difficult for students to take charge of their learning. Of course, students who enter such classes with well-honed self-regulatory skills possess personal resources that poorly self-regulated learners do not possess. Such self-regulatory experts can turn to extra-classroom sources of information, can deduce subtle unspecified criteria for success, and can rely on self-efficacious beliefs derived from earlier successful learning experiences. Regarding the development of self-regulated learners, few teachers ask students to make systematic self-judgments about their schoolwork, and as a result, students are not prompted or encouraged to use self-regulatory subprocesses such as self-observation, self-judgment, and self-reactions. Students who lack awareness of their functioning have little reason to try to alter their personal methods of learning. Finally, teachers who run classrooms where students have little personal choice over their goals, methods, and outcomes of learning can undermine students’ perceptions of control and assumption of responsibility for their classroom outcomes.
See also: Learning to Learn; Motivation, Learning, and Instruction; Self-efficacy: Educational Aspects
Bibliography
Heckhausen J, Dweck C S (eds.) 1998 Motivation and Self-regulation Across the Lifespan. Cambridge University Press, Cambridge, UK
Jan Simons P R, Beukhof R (eds.) 1987 Regulation of Learning. SVO, Den Haag, The Netherlands
Maehr M L, Pintrich P R (eds.) 1991 Advances in Motivation and Achievement: Goals and Self-regulatory Processes. JAI Press, Greenwich, CT, Vol. 7
Pintrich P (ed.) 1995 New Directions in College Teaching and Learning: Understanding Self-regulated Learning. Jossey-Bass, San Francisco, No. 63, Fall
Schunk D H, Zimmerman B J (eds.) 1994 Self-regulation of Learning and Performance: Issues and Educational Applications. Erlbaum, Hillsdale, NJ
Schunk D H, Zimmerman B J (eds.) 1998 Self-regulated Learning: From Teaching to Self-reflective Practice. Guilford Press, New York
Zimmerman B J (ed.) 1986 Self-regulated learning. Contemporary Educational Psychology 11, Special Issue
Zimmerman B J, Schunk D H (eds.) 1989 Self-regulated Learning and Academic Achievement: Theory, Research, and Practice. Springer, New York
B. J. Zimmerman
Self-regulation in Adulthood
Self-regulation is one of the principal functions of the human self, and it consists of processes by which the self manages its own states and actions so as to pursue goals, conform to ideals and other standards, and maintain or achieve desired inner states. Many experts use self-regulation interchangeably with the everyday term self-control, although some invoke subtle distinctions, such as restricting self-control to refer to resisting temptation and stifling impulses.
1. Scope of Self-regulation
Most knowledge about self-regulation can be grouped into four broad categories and one narrower one. The most familiar is undoubtedly impulse control, which refers to regulating one’s behavior so as not to carry out motivated acts that could have harmful or undesirable consequences. Dieting, responsible sexual behavior, recovery from addiction, and control of aggression all fall in the category of impulse control. A second category is affect regulation, or the control of emotions. The most common form of this is the attempt to bring oneself out of a bad mood or bring an end to emotional distress. In principle, however, affect regulation can refer to any attempt to alter any mood or emotion, including all attempts to induce, end, or prolong either positive or negative emotions. Affect regulation is widely regarded as the most difficult and problematic of the major spheres, because most people
cannot change their moods and emotions by direct control or act of will, and so people must rely on indirect strategies, which are often ineffective. The third category is the control of thought. This includes efforts to stifle unwanted ideas or to concentrate on a particular line of thought. It can also encompass efforts to direct reasoning and inference processes, such as in trying to make a convincing case for a predetermined conclusion or to think an issue carefully through so as to reach an accurate judgment. The fourth major category is performance regulation. Effective performance often requires self-regulation, which may include making oneself put forth extra effort or persevere (especially in the face of failure), avoiding choking under pressure, and making optimal tradeoffs between speed and accuracy. A narrower category of self-regulation involves superordinate regulation, which is sometimes called self-management. These processes cut across the others and involve managing one’s life so as to afford promising opportunities. Choosing challenges or tasks (such as college courses) that are appropriately suited to one’s talents, avoiding situations that bring hard-to-resist temptations, and conserving one’s resources during stressful periods all fall in this category.
2. Importance of Self-regulation
The pragmatic importance of self-regulation can scarcely be overstated. Indeed, most of the personal and social problems afflicting citizens of modern, highly developed countries involve some degree of failure at self-control. These problems include drug and alcohol abuse, addiction, venereal disease, unwanted pregnancy, violence, gambling, school failure and dropping out, eating disorders, drunk driving, poor physical fitness, failure to save money, excessive spending and debt, child abuse, and behavioral binge patterns. Experts regularly notify people that they could live longer, healthier lives if they would only quit smoking, eat right, and exercise regularly, but people consistently fail to regulate themselves sufficiently in those three areas. Longitudinal research has confirmed the enduring benefits of self-regulation. Mischel and his colleagues (e.g., Mischel 1996) found that children who were better able to exercise self-control (in the form of resisting temptation and delaying gratification) at age 4 years were more successful socially and academically over a decade later. Thus, these regulatory competencies appear to be stable and to yield positive benefits in multiple spheres over long periods of time. Self-regulation enables people to resist urges for immediate gratification and adaptively pursue their enlightened self-interest in the form of long-term goals. Self-regulation also has considerable theoretical importance. As one of the self’s most important and adaptive functions, self-regulation is central to the effective functioning of the entire personality. The processes by which the self controls itself offer
important insights into how the self is structured and how it operates. Higgins (1996) has analyzed the ‘sovereignty of self-regulation,’ by which he means that self-regulation is the master or supreme function of the self.
3. Feedback Loops
Psychologists have borrowed important insights from cybernetic theory to explain self-regulatory processes (e.g., Powers 1973). The influential work by Carver and Scheier (1981, 1982, 1998) analyzed self-awareness and self-regulation processes in terms of feedback loops that control behavior. In any system (including mechanical ones such as thermostats for heating/cooling systems), control processes depend on monitoring current status, comparing it with goals or standards, and initiating changes where necessary. The basic form of a feedback loop is summarized in the acronym TOTE, which stands for test, operate, test, exit. The test phase involves assessing the current status and comparing it with the goals or ideals. If the comparison yields a discrepancy, some operation is initiated that is designed to remedy the deficit and bring the status into line with what is desired. Repeated tests are performed at intervals during the operation so as to gauge the progress toward the goals. When a test reveals that the desired state has been reached, the final (exit) step is enacted, and the regulatory process is ended. Feedback loops do not necessarily exist in isolation, of course. Carver and Scheier (1981, 1982) explained that people may have hierarchies of such loops. At a particular moment, a person’s actions may form part of the pursuit of long-term goals (such as having a successful career), medium-term goals (such as doing well in courses that will furnish qualifications for that career), and short-term goals (such as persisting to finish a particular assignment in such a course). Once a given feedback loop is exited, indicating the successful completion of a short-term act of self-regulation (e.g., finishing the assignment), the person may revert to aiming at the long-term goals. Emotion plays a central role in the operation of the feedback loop (Carver and Scheier 1998). Naturally, reaching a goal typically brings positive emotions, but such pleasant feelings may also arise simply from making suitable progress toward the goal. Thus, emotion may arise from the rate of change of the discrepancy between the current state and the goal or standard. Meanwhile, negative emotions may arise not only when things get worse but also simply when one stands still and thereby fails to make progress. Carver and Scheier (1998) phrase this as a ‘cruise control’ theory of affect and self-regulation, invoking the analogy to an automobile’s cruise control mechanism. Like that mechanism, emotion kicks in to regulate the process whenever the speed deviates from the prescribed rate of progress toward the goal.
Although this analysis has emphasized negative feedback loops, which seek to reduce discrepancies between real and ideal circumstances, there are also positive (discrepancy-enlarging) feedback loops in which the goal is to increase the discrepancy between oneself and some standard. For example, people may seek to increase the discrepancy between themselves and the average person.
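The TOTE cycle is simple enough to express in a few lines of code. The following Python sketch is purely illustrative: the function names, the numeric state, and the way an affect signal is derived from the rate of discrepancy reduction are assumptions of this example, not part of Carver and Scheier’s formal apparatus.

```python
def tote_loop(current, goal, operate, desired_rate=1.0, tolerance=0.1, max_steps=100):
    """Run one TOTE (test-operate-test-exit) cycle to completion.

    current      -- initial state (a number, for simplicity)
    goal         -- the desired state or standard
    operate      -- a function mapping a state to a new state
    desired_rate -- reference rate of discrepancy reduction per step,
                    the 'cruise control' setting against which affect is read
    """
    history = []
    discrepancy = abs(goal - current)
    steps = 0
    while discrepancy > tolerance and steps < max_steps:   # TEST
        current = operate(current)                         # OPERATE
        new_discrepancy = abs(goal - current)
        progress = discrepancy - new_discrepancy           # reduction this step
        # Affect tracks the rate of progress relative to the desired rate:
        # positive when ahead of schedule, negative when lagging behind.
        affect = progress - desired_rate
        history.append((steps, current, new_discrepancy, affect))
        discrepancy = new_discrepancy                      # TEST again next pass
        steps += 1
    return current, history                                # EXIT

# Example: each operation closes 80% of the remaining gap to a goal of 10.0.
final_state, trace = tote_loop(current=0.0, goal=10.0,
                               operate=lambda s: s + 0.8 * (10.0 - s))
for step, state, gap, affect in trace:
    print(step, round(state, 2), round(gap, 2), round(affect, 2))
```

On this toy run, the early steps close the gap faster than the desired rate and yield positive affect values, while later steps fall behind schedule and yield negative ones, mirroring the cruise-control idea that feeling tracks the rate of progress rather than position.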
4. Self-regulation Failure
Two main types of self-regulatory failure have been identified (for a review, see Baumeister et al. 1994). The more common is underregulation, in which the person fails to exert the requisite self-control. For example, the person may give in to temptation or give up prematurely at a task. The other category is misregulation, in which the person exerts control over the self but does so in some counterproductive manner, so that the desired result is thwarted. Several common themes and factors have been identified in connection with underregulation. One is a failure to monitor the self, which prevents the feedback loop from functioning. Thus, successful dieters often count their calories and keep track of everything they eat. When they cease to monitor how much they eat, such as when watching television or when distracted by emotional distress, they may eat much more without realizing it. Alcohol contributes to almost every known pattern of self-control failure, including impulsive violence, overeating, smoking, gambling and spending, and emotional excess. This pervasive effect probably occurs because alcohol reduces people’s attention to self and thereby impairs their ability to monitor their own behavior (see Hull 1981). Emotional distress likewise contributes to underregulation, and although there are multiple pathways by which it produces this effect, one of them is undoubtedly impairment of the self-monitoring feedback loop. A perennial question is whether underregulation occurs because the self is weak or because the impulse is too strong. The latter invokes the concept of ‘irresistible impulses,’ which is popular with defense lawyers and addicts hoping to imply that no person could have resisted the problematic urge. After reviewing the evidence, Baumeister et al. (1994) concluded that most instances of underregulation involve some degree of acquiescence by the individual, contrary to the image of a valiant person being overwhelmed by an irresistible impulse. For example, during a drinking binge, the person will actively participate in procuring and consuming alcohol, which contradicts claims of having lost control of behavior and being passively victimized by the addiction. Yet clearly people would prefer to resist these failures and so at some level do experience them as against their will. Self-deception and inner conflict undoubtedly contribute to muddying the theoretical waters. At present, then, the relative contributions of powerful
impulses and weak or acquiescent self-control remain incompletely understood, except to indicate that both seem to play some role.

Misregulation typically involves some erroneous understanding of how self and world interact. In one familiar example, sad or depressed people may consume alcohol in the expectation that it will make them feel better, when in fact alcohol is a depressant that often makes them feel even worse. The strategy therefore backfires. By the same token, when under pressure, people may increase their attention to their own performance processes, under the assumption that this extra care will improve performance, but such attention can disrupt the smooth execution of skills, causing what is commonly described as 'choking under pressure.'
5. Strength and Depletion

The successful operation of the feedback loop depends on the 'operate' stage, in which the person actually changes the self so as to bring about the desired result. Recent work suggests that these operations involve a limited resource akin to energy or strength. The traditional concept of 'willpower' is thus being revived in psychological theory. Willpower appears to be a limited resource that is depleted when people exert self-control. In laboratory studies, people who perform one act of self-regulation seem to be impaired on subsequent acts. For example, after resisting the temptation to eat forbidden chocolates, people are less able to make themselves keep trying on a difficult, frustrating task (Baumeister et al. 1998). Whenever there are multiple demands on self-regulation, performance gradually deteriorates, and afterwards the self appears to suffer from depletion of resources (Muraven and Baumeister 2000).

Although the nature of this resource is not known, several important conclusions about it are available. First, it appears that the same strength or energy is used for many different acts of self-regulation, as opposed to each sphere of self-control using a different facility. Trying to quit smoking or keep to a diet may therefore impair a person's ability to persist at work tasks or to maintain control over emotions. Second, the resource is sufficiently limited that even small exertions cause some degree of depletion. These effects do not necessarily imply pending exhaustion; rather, people conserve their resources when partly depleted. Third, the resource is also used for making decisions and choices, taking responsibility, and exercising initiative. Fourth, there is some evidence that willpower can be increased by regular exercise of self-control. Fifth, rest and sleep seem to be effective at replenishing the resource.

The effects of ego depletion provide yet another insight into self-control failure. They help explain the familiar observation that self-control tends to deteriorate when people are working under stress or
pressure: because they are expending their resources on coping with the stress, they have less available for other regulatory tasks.
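The five conclusions above describe a qualitative resource model, which can be caricatured in a few lines of code. The sketch below is illustrative only; the capacity, costs, and recovery behavior are invented numbers, whereas the cited studies establish only the qualitative pattern (a shared pool, depletion by any exertion, conservation, and recovery through rest).

```python
# Minimal sketch of the shared-resource ("strength") view of self-control.
# The capacity, demands, and recovery rule are invented for illustration.

class SelfControlResource:
    """One pool serves all regulatory domains (dieting, persistence, emotion)."""

    def __init__(self, capacity=100.0):
        self.capacity = capacity
        self.level = capacity

    def exert(self, demand):
        """Any act of self-regulation draws down the same pool.

        The returned 'effectiveness' falls as the pool empties, reflecting
        conservation when partly depleted rather than outright exhaustion.
        """
        self.level = max(0.0, self.level - demand)
        return self.level / self.capacity

    def rest(self):
        """Rest and sleep replenish the resource."""
        self.level = self.capacity

pool = SelfControlResource()
pool.exert(30.0)                  # resisting chocolates...
persistence = pool.exert(20.0)    # ...leaves less for task persistence (0.5)
pool.rest()                       # recovery restores the pool
```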
6. Ironic Processes

Another contribution to understanding self-regulatory failure invokes the notion of different mental processes that can work at cross-purposes. Wegner (1994) proposed that effective self-regulation involves both a monitoring process, which remains vigilant for threat or danger (including, for example, the temptation to do something forbidden), and an operating process, which compels the self to act in the desirable fashion. When the monitoring process detects a threat, the operating process helps to prevent disaster, such as when the dieter refuses the offer of dessert.

Unfortunately, the monitoring process is generally more automatic than the operating process, so the monitor may continue to notice temptations or other threats even when the operating process is not working. Upon completion of a diet, for example, the dieter may stop saying no to all tempting foods, but the monitoring process continues to draw attention to every delicious morsel that becomes available, so that the person feels constantly tempted to eat. Likewise, when people are tired or depleted, the monitoring system may continue to seek out troublesome stimuli, with disastrous results.
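Wegner's two-process account can likewise be rendered as a toy sketch. The stimuli, the load threshold, and the function names below are hypothetical; the point is only the asymmetry described above: the monitor is cheap and automatic, while the operator is effortful and fails under load or depletion.

```python
# Toy rendering of Wegner's (1994) ironic-process account: an automatic
# monitoring process keeps detecting threats, while the effortful
# operating process counters them only when capacity is available.

TEMPTATIONS = {"dessert", "cigarette"}   # hypothetical threat set

def monitor(stimulus):
    """Automatic: flags threats regardless of mental load."""
    return stimulus in TEMPTATIONS

def operate(threat_detected, mental_load):
    """Effortful: succeeds only below a (hypothetical) load threshold."""
    return threat_detected and mental_load < 0.7

for stimulus, load in [("dessert", 0.2), ("dessert", 0.9)]:
    noticed = monitor(stimulus)             # fires in both cases
    countered = operate(noticed, load)      # fails under high load
    # Under high load the threat is noticed but not countered: attention
    # is drawn ironically to exactly the forbidden stimulus.
    print(stimulus, "noticed:", noticed, "countered:", countered)
```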
7. Conclusion

Effective self-regulation depends on three ingredients. First, one must have clear and unconflicting standards, such as clear goals or ideals. Second, one must monitor the self consistently, such as by keeping track of behavior. Third, one must have the wherewithal to produce the necessary changes in oneself, and that may include willpower (for direct change) or knowledge of effective strategies (for indirect efforts, such as for changing emotions). Problems in any of these areas can impair self-regulation.

Several areas for future research on self-regulation are promising. The nature of the psychological resource that is expended must be clarified, and brain processes may shed valuable light on how self-regulation works. Interpersonal aspects of self-regulation need to be explicated. The replenishment of depleted resources will also help elucidate the nature of self-control, in addition to being of interest in its own right. Individual differences in self-regulation and developmental processes for acquiring self-control need considerably more study. Applications to clinical, industrial, and relationship phenomena need to be examined.

Effective self-regulation includes the ability to adapt oneself to diverse circumstances. It is a vital aspect of human success in life.

Bibliography

Baumeister R F, Bratslavsky E, Muraven M, Tice D 1998 Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology 74: 1252–65
Baumeister R F, Heatherton T F, Tice D 1994 Losing Control: How and Why People Fail at Self-regulation. Academic Press, San Diego, CA
Carver C S, Scheier M F 1981 Attention and Self-regulation. Springer, New York
Carver C S, Scheier M F 1982 Control theory: A useful conceptual framework for personality–social, clinical and health psychology. Psychological Bulletin 92: 111–35
Carver C S, Scheier M F 1998 On the Self-regulation of Behavior. Cambridge University Press, New York
Higgins E 1996 The 'self digest': Self-knowledge serving self-regulatory functions. Journal of Personality and Social Psychology 71: 1062–83
Hull J G 1981 A self-awareness model of the causes and effects of alcohol consumption. Journal of Abnormal Psychology 90: 586–600
Mischel W 1996 From good intentions to willpower. In: Gollwitzer P M, Bargh J A (eds.) The Psychology of Action. Guilford, New York
Muraven M, Baumeister R F 2000 Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin 126: 247–59
Powers W T 1973 Behavior: The Control of Perception. Aldine Publishing Company, Chicago
Wegner D M 1994 Ironic processes of mental control. Psychological Review 101: 34–52

R. F. Baumeister
Self-regulation in Childhood

Self-regulation has several meanings in the psychology literature, some of which have conceptual links with each other. Three definitions of self-regulation are provided below, but only the first is the theme of this article.
1. Definitions

Self-regulation refers to the development of children's ability to follow everyday customs and valued norms embraced and prescribed by their parents and others. Self-regulation is a vital constituent of the socialization process. A broad construct, self-regulation encompasses a diverse set of behaviors such as compliance, the ability to delay actions when appropriate, and the modulation of emotional, motor, and vocal activity as suitable to norm-based contexts. The self in self-regulation is an essential feature, and involves a sentient self—one that recognizes and understands the reasons for standards and evaluates one's own actions in relation to others' feelings and needs. Thus the hallmark of self-regulation is the ability to act in accordance with various family and social values in
the absence of external monitors, across a variety of situations, but neither slavishly nor mindlessly.

A second meaning of self-regulation refers to various physiological or psychobiological processes that function adaptively to situational demands, and often do not involve normative standards. Examples include the regulation of the intensity of arousal following emotion-producing events, or the adjustment of centrally regulated physiological and perceptual systems to novel events. Centrally regulated systems include electrodermal responses, vagal tone and heart rate, and attention control. Variation in these responses is often studied for links to temperament styles, control of arousal states, and emotion control and coping (e.g., Rothbart et al. 1995).

A third view of self-regulation comes from recent perspectives on motivation. Here, with an emphasis on an individual's goals and the processes that shape the person's actions, a major role is given to the self and to the evaluation of gains and losses (e.g., Heckhausen and Dweck 1998). Autonomy, control, self-integrity, and efficacy are essential in order for the individual to be effective in the pursuit of desired goals. Within the motivational framework, substantial attention is directed to understanding how adults facilitate or inhibit the child's autonomy.

As noted earlier, this entry is concerned with the development of self-regulation to family and sociocultural standards. Related articles include Coping across the Lifespan; Control Behavior: Psychological Perspectives; and Self-regulation in Adulthood.
2. Historical Contexts

The quest to understand the origins and development of self-regulation was a late twentieth-century conceptual endeavor grounded in a clinical issue: why do some children as young as two and three years of age reveal high levels of dysregulated behaviors? Examples include resistance to parental requests, difficulty with family routines, high levels of activity, and unfocused attention often associated with an inability 'to wait.' At the time, earlier findings (e.g., on impulse control and resistance to discipline) had implicated factors such as ineffective parenting, as well as distortions in the child's own cognitive or language skills. With rare exceptions, few attempts had been made to understand the antecedents, age-related correlates, and consequences of early self-regulated or dysregulated behaviors. It seemed that a more comprehensive and integrative perspective could lead to a broadened developmental model, and provide insights about multiple contributors to early-appearing problematic behaviors. This was the impetus for a new view of self-regulation (Kopp 1982). Family and social standards were increasingly emphasized because of their relevance for self-regulation (Kopp 1987, Gralinski and Kopp 1993).
Self-regulation is not a unique psychological construct; rather, it has ideational and behavioral links to courses of action variously labeled as self-management, self-control, and will. These emphases are long-standing and can be found in biblical passages, in early philosophical writings, in manuals for parents from the seventeenth century to the present, and in the essays of William James, Freud, and John Dewey. More recent thinking can be found in several domains of psychology: developmental, personality, social learning, and psychopathology. This enduring historical importance underscores the crucial fact that individual adherence to behavioral norms provides essential support for the structure and function of family and sociocultural groups.

Societal norms are grounded in historical time and locale, and are reflected in caregivers' socialization practices. Changes in practices have implications for the kinds of behaviors expected of children (including those associated with self-regulation). The dramatic changes in the sociocultural scene in the USA since the mid-1900s provide an example. At the beginning of the twentieth century, parenting was primarily authoritarian and children were expected to conform to standards without dissent. However, a mid-century transformation in child-rearing practices occurred. The change was linked to two decades of economic hardship, a devastating war, quests for family life among returning veterans, and a pediatrician (Benjamin Spock) whose counsel to mothers about trusting their own judgement would resonate with new parents. In time, as a more relaxed style of parenting emerged, there was increasing tolerance for children to assert themselves about norms. Parents began to reveal a willingness to negotiate with their children about a variety of things, including how norms could be met. At the beginning of the twenty-first century, a reciprocal style of parenting (often termed authoritative; Baumrind 1967) is far more common than the more inflexible authoritarian approach.

For social scientists, this changing socialization orientation led to new or recast research topics. Studies focused on the characteristics and consequences of authoritative parenting, children's autonomy needs, and understanding how the self-determination needs of the individual balance with the value-based requirements of social–cultural groups. The challenges to define and explain the balancing process continue.
3. Socialization and Self-regulation

3.1 Principles

Despite changing emphases in socialization, three principles remain constant; these have implications for the development of self-regulation. First, most sociocultural groups expect primary authority figures such as parents to begin the active process of socialization. These individuals actively expose older infants
and toddlers to family norms and practices. Later, these socializing agents will be joined by others who typically include teachers, age mates, friends, and neighbors. New socializing agents may reinforce previous norms, expand the interpretation of norms, or introduce new ones. Children are expected to adapt their behavior accordingly.

Second, parents tend to be mindful—perhaps implicitly—of a child's developmental capabilities, the family's functional needs, and sociocultural standards when they teach norms. Protection of the child from harm seems to be the most salient initial child-rearing value. Then the content of parents' socialization efforts progresses to other family concerns, and from there to the broader context of neighborhood, community, and societal norms. Across sociocultural groups, there are assumptions that toddlers and preschoolers will learn a few specific family and neighborhood norms, whereas school-aged children will adopt family and cultural customs and moral standards, formal laws, and economic values. Adolescents are expected to prepare themselves for social, emotional, and economic independence, using as infrastructure the norms of the family and culture.

The third principle is that socialization and self-regulation are adaptive, bidirectional processes. Children and adolescents are not passive recipients in the socialization process. With the growth of their own cognitive skills and recognition of their own self needs, children show an increasing desire to have a say in defining the everyday customs that they encounter, and the ways customs are followed. Thus each succeeding generation in modern, democratic sociocultural groups modifies the content or enactment of customs and norms in some way. An implication for children is that effective self-regulation represents thoughtful, rather than indiscriminate, adherence to family practices and social norms.

3.2 Studying Self-regulation in the Context of Socialization

Self-regulation is often used as a conceptual frame for research rather than as a design element encompassing the three themes of compliance, delay, and behavioral modulation. Instead, most research has focused on the study of compliance during the preschool period (typically, from 3 to 5 years of age). Although this emphasis is warranted—adherence to norms is considered to be a crucial developmental task for this age (Sroufe and Rutter 1984)—important knowledge gaps exist. There is fragmented information about the effective integration of compliance, delay, and behavioral modulation, and about developmental trends among younger and older children.

Recent descriptive studies also reveal considerable complexity in young children's responses to parental requests. This finding has prompted efforts to refine operational definitions, expand the research contexts
for the study of self-regulation, and utilize data-collection procedures that involve multiple measurement techniques and informants. These recent efforts have largely focused on children without major behavioral problems. Lastly, considerable research has highlighted parental and child correlates of self-regulation components, albeit sometimes within a limited age period and one or two contexts. The most complete database is available for the preschool years.

3.3 Research Data: Parents, Children, and Social Norms

The timing of active socialization originates in child behaviors such as the onset of walking. When very young children locomote on their own, they discover opportunities to explore, sometimes with the potential for physical or psychological harm. Thus parents start to identify acceptable and unacceptable behaviors for the child, and attempt to obtain some measure of compliance using a variety of attention-getting techniques (Schaffer 1996). These beginnings mark the dynamic and occasionally formidable interplay between parents' socialization efforts and young children's trajectory toward self-regulation.

Although parents use their child's behavior as a socializing cue, they also rely on family needs (e.g., the composition of a family, living arrangements), as well as societal norms. A toddler in a large family with limited living space is likely to be exposed to somewhat different restrictions from an only child living in spacious quarters. However, three overarching themes unite parents' initial socialization efforts: young children must be protected from harming themselves or others (LeVine 1974, Gralinski and Kopp 1993); they must not tamper with family members' possessions; and they must learn to respect others' feelings (Dunn 1988). As children become older and their social environments expand, parental cautions about normative standards extend beyond the immediate family to peers, neighbors, and teachers, and focus on conventions and moral values, among other topics.

In addition to the content of socialization norms, the how of parental socialization is crucial. There is unequivocal consensus about the effect of child-rearing styles, at least among Euro-American middle-class families: sensitive and knowledgeable parenting is correlated with children's compliance to norms across age periods (Kochanska and Aksan 1995).

3.4 A Self-regulation Developmental Trajectory

Learning to deal with parents' norm-related requests is an all-powerful challenge. Children must bring their cognitive, linguistic, social, and emotional resources to situations that demand self-regulation.
Table 1 Self-regulation: developmental landmarks

Landmarks | Compliance | Delay/waiting (object rewards, others' attention) | Modulation of emotions, vocalizations, and motor activities
Emergent: responses tend to be situation specific and unpredictable | Early 2nd year | Early 2nd year | Early PS(a)
Functional: a few predictable norm responses; some cognizance of self role | End 2nd year; early PS | Mid-PS | Mid-PS
Integrative: response coherence across two self-regulation components; nascent causal reasoning | Late PS | Late PS | Late PS
Internalized: understands, accepts others' values and self role for social norms | Late PS(b); S-A(c) | S-A | S-A
Future oriented: planful behavior for some norm contexts | Mid-late S-A | Mid-late S-A | Mid-late S-A

(a) PS = preschool period. (b) Italics indicate presumed progression. (c) S-A = school-age years.
The task is most difficult for young children: they have cognitive and language limitations; they are inclined to explore anything that looks interesting; and they long for autonomy and self-assertion. It is not surprising that self-regulation takes years to mature into an effective process.

Table 1 depicts the three components of self-regulation along with the ages associated with developing behavioral landmarks. The ages represent approximations; the landmarks highlight behaviors that typify increasing maturity in responding to parental norm-based requests. Using compliance as an exemplar, Emergent signifies that responses to a parental prohibition (e.g., the child does not stand on a kitchen chair) may occur sporadically. Functional portrays fairly regular compliance to a rule (e.g., the child does not stand on chairs, whether in the living room or kitchen); children may show signs of remorse when they do not comply. Integrative refers to predictability (i.e., coherence) in children's behavior across different kinds of norm-based situations (e.g., a child does not yell in a market, respects a sibling's possessions, and waits to be served dinner).

Extant data reveal that compliance to family do's and don'ts tends to be easier for young children than norm-based situations that require waiting or the modulation of ongoing behaviors (also shown in Table 1). This disparity may be related to the additional regulatory demands imposed on children when timing or refined behavioral nuances are critical for an effective response. In these instances, the use of strategic behaviors (e.g., self-distraction, conscious suppression efforts) may be crucial for effective self-regulation.

Despite inevitable setbacks, children become more responsive to specific family and other social rules. The growth of self-regulation is facilitated by greater understanding of people and events, widening sensitivity to everyday happenings in the family and
neighborhood, and increasing ability to talk about self needs in relation to others' needs. The importance of language in self-regulation is exemplified by the transition from the outright refusals common among toddlers to attempts to talk about and negotiate task demands among older preschoolers (Klimes-Dougan and Kopp 1999). Still, children of this age may have difficulty with anger control, waiting for parental attentiveness, and sharing possessions.

An extensive database on topics related to self-regulation in the preschool years suggests that correlates include gender (many girls learn behavioral controls sooner than boys), temperament characteristics such as controlled attention and inhibition of motor acts, verbal skills, effective self-distraction, and competent use of strategies (Eisenberg and Fabes 1992, Metcalfe and Mischel 1999). Overall, the continuing efforts of parents to socialize their children gradually dovetail with the growth of children's own behavioral, cognitive, social, and motivational repertoire. Few details exist about how parents and children collaborate with their own resources so that children increasingly take on responsibility for self-regulation.

What can be said with certainty is that self-regulation gradually emerges when children understand the reasons for values and standards, possess a cognizant self, are able to suppress an immediate goal that runs counter to family norms, and voluntarily assume responsibility for their own actions across a variety of situations. Self-generated and self-monitored adherence to norms is self-regulation. This autonomous, sometimes conscious, effort to frame activities with normative values stands in contrast to early forms of compliance that do not rely on knowledge or self-awareness. Effective self-regulation is by definition an adaptive,
developmental process, in that children have to discover how to meet their own self needs while in general following societal standards across many settings. The task is a changing one, because each age period is associated with new socialization demands imposed by parents, teachers, other individuals, and the larger social context.
4. Self-regulation: Its Value and Directions

The construct of self-regulation has heightened awareness of dysregulation and its long-term implications. Findings from related research point to long-standing problems when young children have difficulty with compliance, delay, and impulse control. To date, however, the developmental antecedents of these problems have been elusive. A renewed focus on the toddler years should be useful: two important developments occur in the second and third years that have relevance for the growth of self-regulation. These are emergent selfhood and the control of attention and consciousness.

It is well known that effective norm-based behaviors require the reconciliation of conflicts between self-goals and social norm demands. Among mature individuals, these conflicts are often met with reflection about courses of action, informal appraisals of costs and benefits, and plans for making reparations should they be necessary. This complex cognitive strategy is not available to young children. Understanding how very young children begin to sort through competing self and social goals may provide insights into effective paths to self-regulation.

With respect to conscious learning about social norms, this almost certainly occurs, but when and how are not well understood. However, consciousness demands psychic energy, so it is in children's best interests to sort out those family and social norms that require relatively habitual responses from those that demand additional attentional or memory efforts. How children learn to differentiate and classify standards may yield understanding about strategies useful for modulating behaviors in novel norm-based contexts, and about why some children falter in these situations.

See also: Parenting: Attitudes and Beliefs; Self-development in Childhood; Self-regulation in Adulthood; Socialization and Education: Theoretical Perspectives; Socialization in Infancy and Childhood; Socialization, Sociology of
Bibliography

Baumrind D 1967 Child care practices anteceding three patterns of preschool behavior. Genetic Psychology Monographs 75: 43–88
Dunn J 1988 The Beginnings of Social Understanding. Blackwell, Oxford, UK
Eisenberg N, Fabes R A 1992 Emotion, regulation, and the development of social competence. In: Clark M S (ed.) Review of Personality and Social Psychology: Vol. 14. Emotion and Social Behavior. Sage, Newbury Park, CA, pp. 119–50
Gralinski J H, Kopp C B 1993 Everyday rules for behavior: Mothers' requests to young children. Developmental Psychology 29: 573–84
Heckhausen J M, Dweck C S 1998 Motivation and Self-regulation Across the Life Span. Cambridge University Press, New York
Klimes-Dougan B, Kopp C B 1999 Children's conflict tactics with mothers: A longitudinal investigation of the toddler and preschool years. Merrill-Palmer Quarterly 45: 226–41
Kochanska G, Aksan N 1995 Mother–child mutually positive affect, the quality of child compliance to requests and prohibitions, and maternal control as correlates of early internalization. Child Development 66: 236–54
Kopp C B 1982 Antecedents of self-regulation: a developmental perspective. Developmental Psychology 18: 199–214
Kopp C B 1987 The growth of self-regulation: caregivers and children. In: Eisenberg N (ed.) Contemporary Topics in Developmental Psychology. Wiley, New York, pp. 34–55
LeVine R A 1974 Parental goals: A cross-cultural view. In: Leichter H J (ed.) The Family as Educator. Teachers College Press, New York, pp. 56–70
Metcalfe J, Mischel W 1999 A hot/cool-system analysis of delay of gratification: dynamics of willpower. Psychological Review 106: 3–19
Rothbart M K, Posner M I, Hershey K L 1995 Temperament, attention, and developmental psychopathology. In: Cicchetti D, Cohen D J (eds.) Manual of Developmental Psychopathology. Wiley, New York, Vol. 1, pp. 315–40
Schaffer H R 1996 Social Development. Blackwell, Oxford, UK
Sroufe L A, Rutter M 1984 The domain of developmental psychopathology. Child Development 55: 17–29
C. B. Kopp
Semantic Knowledge: Neural Basis of

Semantic knowledge is a type of long-term memory, commonly referred to as semantic memory, consisting of concepts, facts, ideas, and beliefs (e.g., Tulving 1983). Semantic memory is thus distinct from episodic or autobiographical memories, which are unique to an individual and tied to a specific time and place. For example, answering the question 'What does the word breakfast mean?' requires semantic memory. In contrast, answering the question 'What did you have for breakfast yesterday?' requires episodic memory, to retrieve information about events in our personal past, as well as semantic memory, to understand the question. Semantic memory therefore includes the information stored in our brains that represents the meaning of words and objects.

Understanding the nature of meaning, however, has proven to be a fairly intractable problem, especially
regarding the meaning of words. One important reason why this is so is that words have multiple meanings. The specific meaning of a word is determined by context, and comprehension is possible because we have contextual representations (see Miller 1999 for a discussion of lexical semantics and context). Cognitive neuroscientists have begun to get traction on the problem of meaning in the brain by limiting inquiry to concrete objects as represented by pictures, and by their names.

Consider, for example, the most common meaning of two concrete objects: a camel and a wrench. Camel is defined as 'either of two species of large, domesticated ruminants (genus Camelus) with a humped back, long neck, and large cushioned feet'; the object wrench is defined as 'any number of tools used for holding and turning nuts, bolts, pipes, etc.' (Webster's New World Dictionary 1988). Two things are noteworthy about these definitions. First, they are largely about features; camels are large and have a humped back; wrenches hold and turn things. Second, different types of features are emphasized for different types of object. The definition of the camel consists of information about its visual appearance, whereas the definition of the wrench emphasizes how it is used. Differences in the types of feature that define different objects have played, and continue to play, a central role in models of how semantic knowledge is organized in the human brain.

Another point about these brief definitions is that they include only part, and perhaps only a small part, of the information we may possess about these objects. For example, we may know that camels are found primarily in Asia and Africa, that they are known as the 'ships of the desert,' and that the word camel can also refer to a color and a brand of cigarettes. Similarly, we also know that camels are larger than a bread box and weigh less than a 747 jumbo jet. Although this information is also part of semantic memory, little is known about the brain bases of these associative and inferential processes. Neuroscientists are, however, beginning to gain insights into the functional neuroanatomy associated with identifying objects and retrieving information about specific object features and attributes.
1. Semantic Representations

A central question for investigators interested in the functional neuroanatomy of semantic memory has been to determine how information about object concepts is represented in the brain. A particularly influential idea guiding much of this work is that the features and attributes that define an object are stored in the perceptual and motor systems active when we first learned about that object. For example, information about the visual form of an object, its typical
color, and its unique pattern of motion would be stored in or near regions of the visual system that mediate perception of form, color, and motion. Similarly, knowledge about the sequences of motor movements associated with the use of an object would be stored in or near the motor systems active when that object was used. This idea has a long history in behavioral neurology. Indeed, many neurologists at the beginning of the twentieth century assumed that the concept of an object was composed of the information about that object learned through direct sensory experience and stored in or near sensory and motor cortices (e.g., Lissauer 1988 [1890]).
2. Semantic Deficits Result from Damage to the Left Temporal Lobe

The modern era of study of the organization of semantic knowledge in the human brain began with Elizabeth Warrington's seminal paper 'The selective impairment of semantic memory' (Warrington 1975). Warrington reported three patients with progressive dementing disorders whose deficits provided neurological evidence for a semantic memory system. There were three main components to the disorder. First, it was selective. The disorder could not be accounted for by general intellectual impairment, sensory or perceptual problems, or an expressive language disorder. Second, the disorder was global, in the sense that it was neither material- nor modality-specific. Object knowledge was impaired regardless of whether objects were represented by pictures or by their written or spoken names. Third, the disorder was graded. Knowledge of specific object attributes (e.g., does a camel have two, four, or six legs?) was more impaired than knowledge of superordinate category information (e.g., is a camel a mammal, bird, or insect?).

Following Warrington's report, similar patterns of semantic memory dysfunction have been reported in patients with brain damage resulting from a wide range of etiologies. These have included patients with progressive dementias such as Alzheimer's disease and semantic dementia, herpes encephalitis, and closed head injury (for review, see Patterson and Hodges 1995). Consistent with the properties established by Warrington, these patients typically had marked difficulty producing object names under a variety of circumstances, including naming pictures of objects, naming from written descriptions of objects, and generating lists of objects that belong to a specific category (e.g., animals, fruits and vegetables, furniture, etc.). In addition, the deficits were associated primarily with damage to the left temporal lobe, suggesting that information about object concepts may be stored, at least in part, in this region of the brain.
3. Brain Damage Can Lead to Category-specific Semantic Deficits

Other patients have been described with relatively selective deficits in recognizing, naming, and retrieving information about different object categories. The categories that have attracted the most attention are animals and tools. This is because of a large and growing number of reports of patients with greater difficulty naming and retrieving information about animals (and often other living things) than about tools (and often other man-made objects). Reports of the opposite pattern of dissociation (greater difficulty for tools than animals) are less frequent. However, enough carefully studied cases have been reported to provide convincing evidence that these categories can be doubly dissociated as a result of brain damage. The impairment in these patients is not limited to visual recognition. Deficits occur when knowledge is probed visually and verbally, and are therefore assumed to reflect damage to the semantic system or systems (for review, see Forde and Humphreys 1999).

While it is now generally accepted that these disorders are genuine, their explanation, on both the cognitive and neural levels, remains controversial. Two general types of explanation have been proposed. The most common explanation focuses on the disruption of stored information about object features. Specifically, it has been proposed that knowledge about animals and tools can be disrupted selectively because these categories are dependent on information about different types of features stored in different regions of the brain. As exemplified by the definitions provided previously, animals are defined primarily by what they look like, and functional attributes play a much smaller role in their definition. In contrast, functional information, specifically how an object is used, is critical for defining tools. As a result, damage to areas where object form information is stored leads to deficits for categories that are overly dependent on visual form information, whereas damage to regions where object use information is stored leads to deficits for categories overly dependent on functional information. The finding that patients with category-specific deficits for animals also have difficulties with other visual-form-based categories, such as precious stones, provides additional support for this view (e.g., Warrington and Shallice 1984).

This general framework for explaining category-specific disorders was first proposed by Warrington and colleagues in the mid- to late 1980s (e.g., Warrington and Shallice 1984). Influential extensions and reformulations of this general idea have been provided by a number of investigators, including Farah and McClelland (1991), Damasio (1990), Caramazza et al. (1990), and Humphreys and Riddoch (1987).

The second type of explanation focuses on broader semantic distinctions (e.g., animate v. inanimate
objects), rather than on features and attributes, as the key to understanding category-specific deficits. A variant of this argument has recently been proposed by Caramazza and Shelton (1998) to counter a number of difficulties with feature-based formulations. Specifically, these investigators note that a central prediction of at least some feature-based models is that patients with a category-specific deficit for animals should have more difficulty answering questions that probe visual knowledge about animals (does an elephant have a long tail?) than questions that probe functional knowledge (is an elephant found in the jungle?). As Caramazza and Shelton (1998) show, at least some patients with an animal-specific knowledge disorder (and, according to their argument, all genuine cases) have equivalent difficulty with both visual and functional questions about animals. As a result of these and other findings, Caramazza and Shelton argue that category-specific disorders cannot be explained by feature-based models. Instead, they propose that such disorders reflect evolutionary adaptations for animate objects, foods, and, perhaps by default, tools and other manufactured objects (the 'domain-specific hypothesis'; Caramazza and Shelton 1998).
4. Functional Brain Imaging Reveals that Semantic Knowledge about Objects is Distributed in Different Regions of the Brain

Functional brain imaging allows investigators to identify the neural systems active when normal individuals perform different types of task. These studies have confirmed findings with brain-damaged subjects, and have begun to extend knowledge of the neural basis of semantic memory. Although functional brain imaging is a relatively new tool, the current evidence suggests that information about different object features is stored in different regions of the cerebral cortex.

One body of evidence in support of this claim comes from experiments using word generation tasks. Subjects are presented with the name of an object, or a picture of an object, and required to generate a word denoting a specific feature or attribute associated with that object. In one study, positron emission tomography (PET) was used to investigate differences in patterns of brain activity when subjects generated the name of an action typically associated with an object (e.g., saying the word 'pull' in response to a static, achromatic line-drawing of a child's wagon), relative to generating an associated color word (e.g., saying 'red' in response to the wagon) (Martin et al. 1995). Relative to generating a color word, action-word generation activated a posterior region of the left temporal lobe, called the middle temporal gyrus. This location was of particular interest because it was just anterior to sites known to be active during motion perception (area MT).
Figure 1 Brain regions active during retrieval of color and action knowledge. A Ventral view of the human brain showing approximate locations of regions in the occipital cortex active during color perception (black ovals), and color word generation in response to object pictures and object names (gray ovals). This region is also active when subjects generate color imagery in response to verbal commands, and in subjects with auditory word–color synesthesia (for review, see Martin 2001). B Lateral view of the left hemisphere showing the location of the region active during motion perception (area MT; black oval), and the region of the middle temporal gyrus active when subjects generated action words to object pictures and object names (gray oval). The location of this region is based on over 20 studies encompassing nine native languages (for review, see Martin 2001). The white oval indicates the approximate location of the region of inferior prefrontal cortex implicated in selecting, retrieving, and maintaining lexical and phonological information (for review, see Martin and Chao 2001)
In contrast, relative to generating action words, color-word generation activated a region on the ventral, or underside, of the temporal lobe, called the fusiform gyrus. This location was of particular interest because it was just anterior to sites known to be active during color perception. Finally, the same pattern of results was found when subjects were presented with the written names of objects, rather than an object picture (Fig. 1) (for review, see Martin 2001).

These findings, and findings from other studies using a wide range of semantic processing tasks, provide support for two important ideas about the neural representation of semantic knowledge. First, there is a single semantic system in the brain, rather than separate semantic systems for different modalities of input (visual, auditory) or types of material (pictures of objects, words) (e.g., Vandenberghe et al. 1996). This system includes multiple brain regions, especially in the temporal and frontal lobes of the left hemisphere (for review, see Price et al. 1999, Martin 2001). Second,
information about object features and attributes is not stored in a single location, but rather in a distributed network of discrete cortical areas. Moreover, the locations of these sites appear to follow a specific plan that parallels the organization of sensory systems and, as will be reviewed below, motor systems as well.
5. Ventral Occipitotemporal Cortex and the Representation of Object Form

Another body of evidence that object concepts may be represented by distributed feature networks comes from studies contrasting the patterns of neural activity associated with naming and performing other types of tasks with objects from different categories. A feature common to all concrete objects is that they have physical shape or form.
Figure 2 Brain regions showing object category-related patterns of activation. Approximate location of regions in the ventral temporal cortex A, and lateral cortex B, active during viewing, naming, and retrieving information about animals (black ovals) and tools (gray ovals). Because of their anatomic proximity to visual object form (and form-related features like color), motion, and motor processing areas, it has been suggested that information about how an object appears, moves, and is used is stored in these regions (see text for details). These regions interact with lower-level visual processing areas in occipital cortex (double arrows), especially when discriminating among items of similar appearance (e.g., four-legged animals, faces). 1. Superior temporal sulcus (STS). 2. Middle temporal gyrus. 3. Premotor cortex
Evidence is accumulating that many object categories elicit distinct patterns of neural activity in regions involved in object form processing (ventral occipital and temporal cortex). Moreover, the locations of these category-related activations appear to be consistent across individual subjects and processing tasks (e.g., naming object pictures, matching pictures, reading object names). This seems to be especially so for objects defined primarily by visual form-related features, such as animals, faces, and landmarks.

Early reports using PET found that naming (Martin et al. 1996) and matching (Perani et al. 1995) pictures of animals resulted in greater activation of the left occipital cortex than performing these same tasks with pictures of tools. Because the occipital cortex is involved primarily in the early stages of visual processing, it was suggested that this activity reflected top-down activation from more anterior sites in the occipitotemporal object processing stream (Martin et al. 1996). This may occur whenever detailed information about visual features or form is needed to identify an object. Specifically, naming objects that differ from other members of the same category by relatively subtle differences in visual form (four-legged
animals) may require access to stored information about visual detail. Retrieving this information, in turn, may require the participation of early visual processing areas in the occipital cortex. A subsequent report showing that unilateral occipital lesions could result in greater difficulty in naming and retrieving information about animals than tools provided converging evidence for this view (Tranel et al. 1997). However, most patients with semantic deficits for animals have had lesions confined to the temporal lobes (for review, see Gainotti et al. 1995).

Functional brain imaging of normal individuals has now provided evidence for category-related activations in the ventral region of the temporal lobes. This has been accomplished by using functional magnetic resonance imaging (fMRI), which provides better spatial resolution than was possible using PET. A number of investigators have found that distinct regions of the ventral temporal cortex show differential responses to different object categories. In one study, viewing, naming, and matching pictures of animals, as well as answering written questions about animals, were found to activate the lateral region of the fusiform
gyrus, relative to performing these tasks with pictures and names of tools. In contrast, the medial fusiform was more active for tools than animals. A similar, but not identical, pattern of activation was found for viewing faces (lateral fusiform) relative to viewing houses (medial fusiform) (Chao et al. 1999). Other investigators have also reported face-related activity in the lateral region of the fusiform gyrus, and house-related activity in more medial regions of the ventral temporal lobe, including the fusiform, lingual, and parahippocampal gyri (Fig. 2) (for review, see Kanwisher et al. 2001, Martin 2001). These findings suggest that different object categories elicit activity in different regions of ventral temporal cortex, as defined by the location of their peak activation. Moreover, the topological arrangement of these peaks was consistent across subjects and tasks. Importantly, however, the activity associated with each object category was not limited to a specific region of the ventral occipitotemporal cortex, but rather was distributed over much of the region (Chao et al. 1999).

Additional evidence for the distributed nature of object representations in the ventral temporal cortex comes from single cell recordings from intracranial depth electrodes implanted in epileptic patients (Kreiman et al. 2000a). Recordings from regions of the medial temporal cortex (entorhinal cortex, hippocampus, and amygdala), which receive major inputs from the ventral temporal regions described above, identified neurons that showed highly selective responses to different object categories, including animals, faces, and houses. Moreover, the responses of these neurons were category-specific rather than stimulus-specific. That is, animal-responsive cells responded to all pictures of animals, rather than to one picture or a select few.

Studies reporting similar patterns of neural activity when subjects view and imagine objects provide further support for the idea that object information is stored in these regions of cortex. For example, regions active during face perception are also active when subjects imagine famous individuals (O'Craven and Kanwisher 2000). Similar findings have been reported for viewing and imagining known landmarks (O'Craven and Kanwisher 2000), houses, and even chairs (Ishai et al. 2000). In addition, the majority of category-selective neurons recorded from human temporal cortex also responded selectively when the patients were asked to imagine these objects (Kreiman et al. 2000b).

Taken together, the data suggest that the ventral occipitotemporal cortex may be best viewed not as a mosaic of discrete category-specific areas, but rather as a lumpy feature-space representing stored information about the features of object form shared by members of a category. How this feature space is organized, and why its topological arrangement is so consistent from one subject to another, are critical questions for future investigations.
6. Lateral Temporal Cortex and the Representation of Object Motion

Information about how objects move through space, and the patterns of motor movements associated with their use, are other features that could aid object identification. This would be especially true for categories of manufactured objects, such as tools, that have a more variable mapping between their name and their visual form than a category such as four-legged animals. Thus access to these additional features may be required to identify them as unique entities. Here again, evidence is accumulating that naming and identifying objects with motion-related attributes activate areas close to the regions that mediate perception of object motion (the posterior region of the lateral temporal lobe), with different patterns of activity associated with biological and manufactured objects.

A number of laboratories, using a variety of paradigms with pictures and words, have reported that tools elicit greater activity in the posterior left middle temporal gyrus than animals and other object categories (for review, see Martin 2001). Moreover, the active region was just anterior to area MT, and overlapped with the region active in the verb generation studies discussed above. Damage to this region has been reported to selectively impair tool recognition and naming (Tranel et al. 1997). In contrast, naming animals and viewing faces elicits greater activity in the superior temporal sulcus (STS) (Fig. 2). This region is of particular interest because of its association with the perception of biological motion in monkeys as well as humans (for review, see Allison et al. 2000).

As suggested for the ventral temporal cortex, neurons in the lateral temporal cortex may also be tuned to features that objects within a category share. The nature of these features is unknown; however, based on its anatomical proximity to visual motion processing areas, this region may be tuned to features of motion associated with different objects. In support of this conjecture, increased activity in the posterior lateral temporal cortex has been found when subjects viewed static pictures of objects that imply motion (Kourtzi and Kanwisher 2000, Senior et al. 2000), and when subjects focused attention on the direction of eye gaze (Hoffman and Haxby 2000). Investigation of the differences in the properties of motion associated with biological and manufactured objects may provide clues to the organization of this region.
7. Ventral Premotor Cortex and the Representation of Use-associated Motor Movements

If activations associated with different object categories reflect stored information about object properties, then one would expect tools to elicit activity in
motor-related regions. Several laboratories have reported this association. Specifically, greater activation of the left ventral premotor cortex has been found for naming tools relative to naming animals, for viewing pictures of tools relative to viewing pictures of animals, faces, and houses, and for generating action words to tools (Fig. 2). Mental imagery (e.g., imagining manipulating objects with the right hand) has also resulted in ventral premotor activation (Fig. 2) (for review, see Martin 2001). Electrophysiological studies have identified cells in monkey ventral premotor cortex that responded not only when objects were grasped, but also when the animals viewed objects they had had experience of manipulating (Jeannerod et al. 1995). The ventral premotor activation noted in the human neuroimaging studies may reflect a similar process. These findings are consistent with reports of patients with greater difficulty in naming tools than animals following damage to the left lateral frontal cortex (for review, see Gainotti et al. 1995), and suggest that the left ventral premotor cortex may be necessary for naming and retrieving information about tools.
8. Conclusions and Future Directions

Evidence from functional brain imaging studies provides considerable support for feature-based models of semantic representation. However, some of the findings could be interpreted as evidence for the 'domain-specific' hypothesis as well. For example, the clustering of activations associated with animals and faces, on the one hand, and tools and houses, on the other, may be viewed as consistent with this interpretation. Other evidence suggests, however, that all nonbiological object representations do not cluster together. For example, it has been reported that activity associated with a category of objects of no evolutionary significance (chairs) was located lateral to the face-responsive region (in the inferior temporal gyrus), rather than medially, where tools and houses elicit their strongest activity (Ishai et al. 1999).

Both functional brain imaging and patient studies suggest that object knowledge is represented in distributed cortical networks. There do not seem to be single regions that map on to whole object categories. Nevertheless, there may be a broader organization of these networks that reflects evolutionarily adapted, domain-specific knowledge systems for biological and nonbiological kinds of object. This possibility remains to be explored.

Although progress is being made in understanding the neural substrate of some aspects of meaning, many important questions and issues remain. For example, semantic representations are prelexical. Yet, to be of service, they must be linked intimately to the lexicon. Little is known about how the lexicon is organized in the brain, and how lexical and semantic networks interact (Damasio et al. 1996).
Another important issue concerns the neural basis of retrieval from semantic memory. Semantic knowledge, like all stored information, is not useful unless it can be retrieved efficiently. Studies of patients with focal lesions have shown that the left lateral prefrontal cortex is involved critically in word retrieval, even in the absence of a frank aphasia (e.g., Baldo and Shimamura 1998). Functional brain imaging studies have confirmed this association (Fig. 2). Moreover, recent evidence suggests that different regions of the left inferior prefrontal cortex may mediate selection among competing alternatives in semantic memory, whereas other regions may be involved in retrieving, manipulating, and maintaining semantic information in working memory (for review, see Martin and Chao 2001). Much additional work will be needed to describe the role that different regions of prefrontal cortex play in semantic processing.

Another critical question will be to determine how semantic object representations are modified by experience. Some of the findings discussed here suggest that the topological arrangement among object categories in the cortex is relatively fixed. Other evidence suggests a much more flexible organization, in which the development of expertise with an object category involves a particular portion of the fusiform gyrus (Gauthier et al. 1999). Longitudinal studies tracking changes in the brain associated with learning about completely novel objects should help to clarify this issue.

There is also little known about where object information unrelated to sensory or motor properties is stored (e.g., that camels live in Asia and Africa). Similarly, little is known about the representation of abstract concepts (e.g., honor, liberty, and justice), metaphors, and the like. Finally, questions concerning the neural systems involved in object category formation have been almost totally neglected. E. E. Smith and colleagues have shown that exemplar-based and rule-based categorization activate different neural structures (Smith et al. 1999). This finding suggests that categorization processes may be a fruitful area for future investigation. The advent of techniques for combining fMRI data with technologies such as magnetoencephalography (MEG), which provide temporal information on the order of milliseconds, should provide a wealth of new information on the neural basis of semantic knowledge.

See also: Comprehension, Cognitive Psychology of; Dementia, Semantic; Evolution and Language: Overview; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Meaning and Rule-following: Philosophical Aspects; Memory for Meaning and Surface Memory; Semantic Similarity, Cognitive Psychology of; Semantics; Sentence Comprehension, Psychology of; Word, Linguistics of; Word Meaning: Psychological Aspects
Bibliography
Allison T, Puce A, McCarthy G 2000 Social perception and visual cues: role of the STS. Trends in Cognitive Sciences 4: 267–78
Baldo J V, Shimamura A P 1998 Letter and category fluency in patients with frontal lobe lesions. Neuropsychology 12: 259–67
Caramazza A, Shelton J R 1998 Domain-specific knowledge systems in the brain: the animate-inanimate distinction. Journal of Cognitive Neuroscience 10: 1–34
Caramazza A, Hillis A E, Rapp B C, Romani C 1990 The multiple semantics hypothesis: multiple confusions? Cognitive Neuropsychology 7: 161–89
Chao L L, Haxby J V, Martin A 1999 Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nature Neuroscience 2: 913–19
Damasio A R 1990 Category-related recognition deficits as a clue to the neural substrates of knowledge. Trends in Neurosciences 13: 95–8
Damasio H, Grabowski T J, Tranel D, Hichwa R D, Damasio A R 1996 A neural basis for lexical retrieval. Nature 380: 499–505
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality specificity and emergent category specificity. Journal of Experimental Psychology: General 120: 339–57
Forde E M E, Humphreys G W 1999 Category-specific recognition impairments: a review of important case studies and influential theories. Aphasiology 13: 169–93
Gainotti G, Silveri M C, Daniele A, Giustolisi L 1995 Neuroanatomical correlates of category-specific semantic disorders: a critical survey. Memory 3: 247–64
Gauthier I, Tarr M J, Anderson A W, Skudlarski P, Gore J C 1999 Activation of the middle fusiform 'face area' increases with expertise in recognizing novel objects. Nature Neuroscience 2: 568–73
Hoffman E A, Haxby J V 2000 Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nature Neuroscience 3: 80–4
Humphreys G W, Riddoch M J 1987 On telling your fruit from your vegetables: A consideration of category-specific deficits after brain damage. Trends in Neurosciences 10: 145–48
Ishai A, Ungerleider L G, Haxby J V 2000 Distributed neural systems for the generation of visual images. Neuron 28: 979–90
Ishai A, Ungerleider L G, Martin A, Schouten J L, Haxby J V 1999 Distributed representation of objects in the ventral visual pathway. Proceedings of the National Academy of Sciences, USA 96: 9379–84
Jeannerod M, Arbib M A, Rizzolatti G, Sakata H 1995 Grasping objects: The cortical mechanisms of visuomotor transformation. Trends in Neurosciences 18: 314–20
Kanwisher N, Downing P, Epstein R, Kourtzi Z 2001 Functional neuroimaging of visual recognition. In: Cabeza R, Kingstone A (eds.) Handbook of Functional Neuroimaging of Cognition. MIT Press, Cambridge, MA
Kourtzi Z, Kanwisher N 2000 Activation in human MT/MST by static images with implied motion. Journal of Cognitive Neuroscience 12: 48–55
Kreiman G, Koch C, Fried I 2000a Category-specific visual responses of single neurons in the human medial temporal lobe. Nature Neuroscience 3: 946–53
Kreiman G, Koch C, Fried I 2000b Imagery neurons in the human brain. Nature 408: 357–61
Lissauer H 1988 [1890] A case of visual agnosia with a contribution to theory. Cognitive Neuropsychology 5: 157–92
Martin A 2001 Functional neuroimaging of semantic memory. In: Cabeza R, Kingstone A (eds.) Handbook of Functional Neuroimaging of Cognition. MIT Press, Cambridge, MA
Martin A, Chao L L 2001 Semantic memory and the brain: Structure and processes. Current Opinion in Neurobiology 11: 194–201
Martin A, Haxby J V, Lalonde F M, Wiggs C L, Ungerleider L G 1995 Discrete cortical regions associated with knowledge of color and knowledge of action. Science 270: 102–5
Martin A, Wiggs C L, Ungerleider L G, Haxby J V 1996 Neural correlates of category-specific knowledge. Nature 379: 649–52
Miller G A 1999 On knowing a word. Annual Review of Psychology 50: 1–19
O'Craven K M, Kanwisher N 2000 Mental imagery of faces and places activates corresponding stimulus-specific brain regions. Journal of Cognitive Neuroscience 6: 1013–23
Patterson K, Hodges J R 1995 Disorders of semantic memory. In: Baddeley A D, Wilson B A, Watts F N (eds.) Handbook of Memory Disorders. Wiley, New York
Perani D, Cappa S F, Bettinardi V, Bressi S, Gorno-Tempini M, Matarrese M, Fazio F 1995 Different neural systems for the recognition of animals and man-made tools. Neuroreport 6: 1637–41
Price C J, Indefrey P, van Turennout M 1999 The neural architecture underlying the processing of written and spoken word forms. In: Brown C M, Hagoort P (eds.) Neurocognition of Language Processing. Oxford University Press, New York
Senior C, Barnes J, Giampietro V, Simmons A, Bullmore E T, Brammer M, David A S 2000 The functional neuro-anatomy of implicit motion perception of 'representational momentum'. Current Biology 10: 16–22
Smith E E, Patalano A L, Jonides J 1999 Alternative strategies of categorization. Cognition 65: 167–96
Tranel D, Damasio H, Damasio A R 1997 A neural basis for the retrieval of conceptual knowledge. Neuropsychologia 35: 1319–27
Tulving E 1983 Elements of Episodic Memory. Oxford University Press, New York
Vandenberghe R, Price C, Wise R, Josephs O, Frackowiak R S 1996 Functional anatomy of a common semantic system for words and pictures. Nature 383: 254–6
Warrington E K 1975 The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology 27: 635–57
Warrington E K, Shallice T 1984 Category specific semantic impairments. Brain 107: 829–54
Webster's New World Dictionary 1988 3rd college edn. Simon and Schuster, New York
A. Martin
Semantic Processing: Statistical Approaches
The development of methods for representing meaning is a critical aspect of cognitive modeling and of applications that must extract meaning from text input. This ability to derive meaning is the key to any
approach that needs to use or evaluate knowledge. Nevertheless, determining how meaning is represented and how information can be converted to this representation is a difficult task. For example, any theory of meaning must describe how the meaning of each individual concept is specified and how relationships among concepts may be measured. Thus, appropriate representations of meaning must first be developed. Second, computational techniques must be available to permit the derivation and modeling of meaning using these representations. Finally, some form of information about the concepts must be available in order to permit a computational technique to derive the meaning of the concepts. This information about the concepts can either be more structured human-based input, for example dictionaries or links among related concepts, or less structured natural language. With the advent of more powerful computing and the availability of on-line texts and machine-readable dictionaries, novel techniques have been developed that can automatically derive semantic representations. These techniques capture effects of regularities inherent in language to learn about semantic relationships among words. Other techniques have relied on hand-coding of semantic information, which is then placed in an electronic database so that users may still apply statistical analyses on the words in the database. All of these techniques can be incorporated into methods for modeling a wide range of psychological phenomena such as language acquisition, discourse processing, and memory. In addition, the techniques can be used in applied settings in which a computer can derive semantic knowledge representations from text. These settings include information retrieval, natural language processing, and discourse analysis.
1. The Representation of Meaning
Definitions of semantics generally encompass the concepts of knowledge of the world, of meanings of words, and of relationships among the words. Thus, techniques that perform analyses of semantics must have representations that can account for these different elements. Because these computational techniques rely on natural language, typically in the form of electronic text, they must further be able to convert information contained in the text to the appropriate representation. Two representational systems that are widely used in implementations of computational techniques are feature systems and semantic networks. While other methods of semantic representation (e.g., schemas) also account for semantic information in psychological models, they are not as easily specified computationally.
1.1 Feature Systems
The primary assumption behind feature systems is that if a fixed number of basic semantic features can be discovered, then semantic concepts can be defined by their combination, or composition, of these features. Thus, different concepts will have different basic semantic features or levels of these features. Concepts vary in their relatedness to each other, based on the degree to which they have the same semantic features (e.g., Smith and Medin 1981). One approach to defining empirically the meaning of words using a feature system was the use of the semantic differential (Osgood et al. 1957). By having people rate how words fell on a Likert scale of different bipolar adjectives (e.g., fair–unfair, active–passive), a word could be represented as the combination of ratings. Through collecting large numbers of these ratings, Osgood could define a multidimensional semantic space in which the meaning of a word is represented as a point in that space based on the ratings on each scale, and words could be compared with each other through measuring distances in the space. Feature representations have been widely used in psychological models of memory and categorization. They provide a simple representation for encoding information for input into computational models, most particularly connectionist models (e.g., Rumelhart and McClelland 1986). While the representation permits assigning concepts based on their features, it does not explicitly show relations among the concepts. In addition, in many models using feature representations, the dimensions are hand-created; therefore, it is not clear whether the features are based on real-world constructions, or are derived psychological constructions developed on the fly, based on the appropriate context. Some of the statistical techniques described below avoid the problem of using humanly defined dimensions by automatically extracting relevant dimensions.
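To make the feature-space idea concrete, the following sketch (in Python; the words, scales, and rating values are invented purely for illustration) represents each word as a vector of bipolar-scale ratings in the manner of the semantic differential and compares words by Euclidean distance:

import math

# Hypothetical mean ratings on three bipolar scales
# (fair-unfair, active-passive, strong-weak), each from -3 to +3.
ratings = {
    "hero":    (2.5, 2.0, 2.8),
    "villain": (-2.7, 1.8, 2.1),
    "pebble":  (0.1, -2.6, -2.4),
}

def distance(w1, w2):
    # Euclidean distance between two words in the rating space.
    a, b = ratings[w1], ratings[w2]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance("hero", "villain"))  # differ mainly on the evaluative scale
print(distance("hero", "pebble"))   # differ on activity and potency as well

Words that lie close together in this space count as similar on the chosen scales; the limitation noted above is visible here as well, since the three scales themselves had to be stipulated by hand.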
1.2 Semantic or Associative Networks
A semantic network approach views the meaning of concepts as being determined by their relations to other concepts. Concepts are represented as nodes with labeled links (e.g., IS-A or Part-of) as relationships among the nodes. Thus, knowledge is a combination of information about concepts and how those concepts relate to each other. Based on the idea that activation can spread from one node to another, semantic networks have been quite influential in the development of models of memory. Semantic networks and spreading activation have been widely used for modeling sentence verification times and priming, and have been incorporated into many localist connectionist models. Semantic networks permit economies of storage because concepts can inherit properties shared by other concepts (e.g., Collins and Quillian 1969). This basis has made semantic networks a popular approach for the development of computer-based lexicons, most particularly within the field of artificial intelligence.
Semantic Processing: Statistical Approaches Nevertheless, many of the assumptions for relatedness among concepts must still be based on handcrafted networks in which the creator of the network uses their knowledge to develop the concepts and links.
2. Statistical and Human Approaches to Deriving Meaning
No matter what type of representation is used for meaning, some form of information must be stored within the representation for it to be useful. There are two primary techniques to derive representations and fill in the lexical information. The first is to handcraft the information through human judgments about definitions, categorization, organization, or associations among concepts. The second is to use automated methods to extract information from existing on-line texts. The former approach relies on human expertise, and it is assumed that humans' introspective abilities and domain knowledge will provide an accurate representation of lexical relations. The latter approach assumes that the techniques can create a useful and accurate representation of meaning from processing natural language input.
2.1 Human Extraction of Meaning for Knowledge Bases
Long before the existence of computing, humans developed techniques for recording the meaning of words. These lexical compilations include dictionaries, thesauri, and ontologies. While not strictly statistical techniques for deriving meaning, human-based methods deserve mention for several reasons. Many of these lexical compilations are now available on-line. For example, there are a number of machine-readable dictionaries on-line (e.g., Wilks et al. 1996), as well as projects to develop large ontologies of domain knowledge. Because of this availability, statistical techniques can be applied to these on-line lexicons to extract novel information automatically from a lexicon, for example to discover new semantic relations. In addition, the information from existing lexical entries can be used in automatic techniques for categorizing new lexical items. One notable approach has been the development of WordNet (see Fellbaum 1998). WordNet is a hand-built on-line lexical reference database which represents both the forms and meanings of words. Lexical concepts are organized in synonym sets (synsets) which represent semantic and lexical relations among concepts including synonymy, antonymy, hyponymy, meronymy, and morphological relations. Automated techniques (described in Fellbaum 1998) have been applied to WordNet for a number of applications including discovering new relations among words, performing automated word-sense identification, and
applying it to information retrieval through automated query expansion. Along with machine-readable dictionaries and ontologies, additional human-based approaches to deriving word similarity have collected large numbers of word associations to derive word-association norms (e.g., Deese 1965) and ratings of words on different dimensions (e.g., Osgood et al. 1957). The statistics from these collections are incorporated into computational cognitive models. In human-generated representations, however, there is much overhead involved in collecting the set of information before the lexical representation can be used. Because of this, it is not easily adapted to new languages or new domains. Further, handcrafted derivations of relationships among words do not provide a basis for a representational theory.
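As an illustration of how such a hand-built lexical database can be queried programmatically, the following lines use NLTK's WordNet interface (this assumes the Python nltk package is installed and the WordNet data has been downloaded via nltk.download; the particular senses returned depend on the WordNet version):

from nltk.corpus import wordnet as wn

dog = wn.synsets("dog")[0]   # first sense, e.g. Synset('dog.n.01')
cat = wn.synsets("cat")[0]

print(dog.definition())           # the gloss for this sense
print(dog.hypernyms())            # IS-A relations in the synset hierarchy
print(dog.path_similarity(cat))   # similarity from hyponymy path length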
2.2 Automatic Techniques for Deriving Semantics
'You shall know a word by the company it keeps' (Firth 1957). The context in which any individual word is used can provide information about both the word's syntactic role and the semantic contributions of the word to the context. Thus, with appropriate techniques for measuring the use of words in language, we should be able to infer the meaning of those words. While much research has focused on automatically extracting syntactic regularities from language, there has been a recent increase in research on approaches to extracting semantic information. This research takes a structural linguistic approach in its assumption that the structure of meaning (or of language in general) can be derived or approximated through distributional measures and statistical analyses. In order automatically to derive a semantic representation from analyzing language, several assumptions must be fulfilled. First, it is assumed that information about the cooccurrence of words within contexts will provide appropriate information about semantic relationships. For example, the fact that 'house' and 'roof' often occur in the same context or in similar contexts indicates that there must be a relationship between them. Second, assumptions must be made about what constitutes the context in which words appear. Some techniques use a moving window which moves across the text analyzing five to 10 words as a context; others use sentences, paragraphs, or complete documents as the complete context in which words appear. Finally, corpora of hundreds of thousands to millions of words of running text are required so that there are enough occurrences of information about how the different words appear within their contexts. A mathematical overview of many statistical approaches to natural language may be found in Manning and Schütze (1999), and a review of statistical techniques applied to corpora may be found in
Boguraev and Pustejovsky (1996). Below, we focus on a few techniques that have been applied to psychological modeling, have shown psychological plausibility, and/or have provided applications that may be used more directly within cognitive science. The models described all use feature representations of words, in which words are represented as vectors of features. In addition, the models automatically derive the feature dimensions rather than having them predefined by the researcher. 2.2.1 The HAL model. The HAL (Hyperspace Analogue to Language) model uses lexical cooccurrence to develop a high-dimensional semantic representation of words (Burgess and Lund 1999). Using a large corpus (320 million words) of naturally occurring text, they derive vector representations of words based on a 10-word moving window. Vectors for words that are used in related contexts (within the same window) have high similarity. This vector representation can characterize a variety of semantic and grammatical features of words, and has been used to investigate a wide range of cognitive phenomena. For example, HAL has been used for modeling results from priming and categorization studies, resolving semantic ambiguity, modeling cerebral asymmetries in semantic representations, and modeling semantic differences among word classes, such as concrete and abstract nouns. 2.2.2 Latent Semantic Analysis. Like HAL, Latent Semantic Analysis (LSA) derives a high-dimensional vector representation based on analyses of large corpora (Landauer and Dumais 1997). However, LSA uses a fixed window of context (e.g., the paragraph level) to perform an analysis of cooccurrence across the corpus. A factor analytic technique (singular value decomposition) is then applied to the cooccurrence matrix in order to derive a reduced set of dimensions (typically 300 to 500). This dimension reduction causes words that are used in similar contexts, even if they are not used in the same context, to have similar vectors. For example, although 'house' and 'home' both tend to occur with 'roof,' they seldom occur together in language. Nevertheless, they would have similar vector representations in LSA. In LSA, vectors for individual words can be summed to provide measures of the meaning of larger units of text. Thus, the meaning of a paragraph would be the sum of the vectors of the words in that paragraph. This basis permits the comparison of larger units of text, such as comparing the meaning of sentences, paragraphs, or whole documents to each other. LSA has been applied to a number of different corpora, ranging from large samples of language that children and adults would have encountered to specific
corpora on particular domains, such as individual course topics. For the more general corpora, it derives a generalized semantic representation of knowledge similar to that general knowledge acquired by people in life. For the domain-specific corpora, it generates a representation more similar to that of people knowledgeable within that domain area. LSA has been used both as a theoretical model and as a tool for the characterization of semantic relatedness of units of language (see Landauer et al. 1998 for a review). As a theoretical model, LSA has been used to model the speed of acquisition of new words by children, its scores overlap those of humans on standard vocabulary and subject matter tests, it mimics human word sorting and category judgments, it simulates word-word and passage-word lexical priming data, and it accurately estimates textual coherence and the learnability of texts by individual students. The vector representation in LSA can be applied within other theoretical models. For example, propositional representations based on LSA-derived vectors have been integrated into the Construction-Integration model, a symbolic connectionist model of language (see Kintsch 1998). As an application, LSA has been used to measure the quality and quantity of knowledge contained in essays, for matching user queries to documents in information retrieval, and for performing automatic discourse segmentation. 2.2.3 Connectionist approaches. Connectionist modeling uses a network of interacting processing units operating on feature vectors to model cognitive phenomena. It has been widely used to model aspects of language processing. Although in some connectionist models words or concepts are represented as vectors in which the features have been predefined (e.g., McClelland and Kawamoto 1986), recent models have automatically derived the representation. Elman (1990) implemented a simple recurrent network that used a moving window analyzing a set of sentences from a small lexicon and artificial grammar. Based on a cluster analysis of the activation values of the hidden units, the model could predict syntactic and semantic distinctions in the language, and was able to discover lexical classes based on word order. One current limitation, however, is that it is not clear how well the approach can scale up to much larger corpora. Nevertheless, like LSA, due to the constraint satisfaction in connectionist models, the pattern of activation represented in the hidden units goes beyond direct cooccurrence, and captures more of the contextual usage of words.
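A minimal sketch of the two corpus-based ideas described above is given below (Python with NumPy; the three-sentence 'corpus' is purely illustrative, whereas real systems operate on millions of words). HAL-style counts are collected with a moving window, and an LSA-style singular value decomposition is then applied; note that LSA proper factors a word-by-passage matrix rather than the word-by-word matrix used here:

import numpy as np

corpus = ("the house has a roof . the home has a roof . "
          "the dog has a bone .").split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}

# HAL-style step: count co-occurrences within a moving window.
window = 3
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            counts[index[w], index[corpus[j]]] += 1

# LSA-style step: keep only the top k singular dimensions.
U, s, Vt = np.linalg.svd(counts)
k = 2
vectors = U[:, :k] * s[:k]

def cosine(w1, w2):
    a, b = vectors[index[w1]], vectors[index[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine("house", "home"))  # high despite no direct co-occurrence

Even in this toy corpus, 'house' and 'home' never co-occur directly, yet the dimension reduction assigns them similar vectors because their contexts are similar.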
3. Conclusions
Statistical techniques for extracting meaning from on-line texts and for extending the use of machine-readable dictionaries have become viable approaches
for creating semantic-based models and applications. The techniques go beyond modeling just cooccurrence of words. For example, the singular value decomposition in LSA or the use of hidden units in connectionist models permits derivation of semantic similarities that are not found in local cooccurrence but that are seen in human knowledge representations. The techniques incorporate both the idea of feature vectors, from feature-based models, and the idea that words can be defined by their relationships to other words found in semantic networks.
3.1 Advantages and Disadvantages of Statistical Semantic Approaches
Compared to many models of semantic memory, statistical semantic approaches are quite parsimonious, using very few assumptions and parameters to derive an effective representation of meaning. They further avoid problems of human-based meaning extraction since the techniques can process realistic environmental input (natural language) directly into a representation. The techniques are fast, requiring only hours or days to develop new lexicons, and can work in any language. They typically need large amounts of natural language to derive representations; thus, large corpora must be obtained. Nevertheless, because they are applied to large corpora, the lexicons that are developed provide realistic representations of tens to hundreds of thousands of words in a language.
3.2 Theoretical and Applied Uses of Statistical Semantics
Within cognitive modeling, statistical semantic techniques can be applied to almost any model that must incorporate meaning. They can therefore be used in modeling in such areas as semantic and associative priming, lexical ambiguity resolution, anaphoric resolution, acquisition of language, categorization, lexical effects on word recognition, and higher-level discourse processing. The techniques can provide useful additions to a wide range of applications that must encode or model meaning in language; for example, information retrieval, automated message understanding, machine translation, and discourse analysis.
3.3 The Future of Statistical Semantic Models in Psychology
While it is important that the techniques provide effective representations for applications, it is also important that the techniques have psychological plausibility. The study of human language processing can help inform the development of more effective methods of deriving and representing semantics. In
turn, the development of the techniques can help improve cognitive models. For this to happen, strong ties must be formed between linguists, computational experts, and psychologists. In addition, the techniques and lexicons derived from them are not widely available to all researchers. For the techniques to succeed, better distribution, either through Web interfaces or through software, will allow them to be more easily incorporated into a wider range of cognitive models. See also: Connectionist Models of Language Processing; Lexical Access, Cognitive Psychology of; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexical Semantics; Lexicon; Memory for Meaning and Surface Memory; Semantic Knowledge: Neural Basis of; Semantic Similarity, Cognitive Psychology of; Semantics; Word Meaning: Psychological Aspects
Bibliography
Boguraev B, Pustejovsky J 1996 Corpus Processing for Lexical Acquisition. MIT Press, Cambridge, MA
Burgess C, Lund K 1999 The dynamics of meaning in memory. In: Dietrich E, Markman A B (eds.) Cognitive Dynamics: Conceptual and Representational Change in Humans and Machines. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 117–56
Collins A M, Quillian M R 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
Deese J 1965 The Structure of Associations in Language and Thought. Johns Hopkins University Press, Baltimore, MD
Elman J L 1990 Finding structure in time. Cognitive Science 14: 179–211
Fellbaum C 1998 WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA
Firth J R 1968 A synopsis of linguistic theory 1930–1955. In: Palmer F (ed.) Selected Papers of J.R. Firth. Longman, New York, pp. 32–52
Kintsch W 1998 Comprehension: A Paradigm for Cognition. Cambridge University Press, New York
Landauer T K, Dumais S T 1997 A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211–40
Landauer T K, Foltz P W, Laham D 1998 An introduction to Latent Semantic Analysis. Discourse Processes 25: 259–84
Manning C D, Schütze H 1999 Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA
McClelland J L, Kawamoto A H 1986 Mechanisms of sentence processing: Assigning roles to constituents. In: Rumelhart D E, McClelland J L (eds.) PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Vol. 2. MIT Press, Cambridge, MA, pp. 272–325
Osgood C E, Suci G J, Tannenbaum P H 1957 The Measurement of Meaning. University of Illinois Press, Urbana, IL
Rumelhart D E, McClelland J L 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA, Vol. 1
Smith E E, Medin D L 1981 Categories and Concepts. Harvard University Press, Cambridge, MA
Wilks Y, Slator B, Guthrie L 1996 Electric Words: Dictionaries, Computers and Meanings. MIT Press, Cambridge, MA
P. W. Foltz
Semantic Similarity, Cognitive Psychology of
Semantic similarity refers to similarity based on meaning as opposed to form. The term is used widely throughout cognitive psychology and related areas such as psycholinguistics, memory, reasoning, and neuropsychology. For example, semantically similar words are confusable with each other, and can prime each other, with the consequence that verbal memory performance is heavily dependent on similarity of meaning (see False Memories, Psychology of; Priming, Cognitive Psychology of). In the context of reasoning, people draw inductive inferences on the basis of semantic similarity, for example, inferring that properties of cows are more likely to be true of horses than to be true of hedgehogs (Heit 2001). But what is semantic similarity? In experimental work, semantic similarity is often estimated directly from participants' explicit judgments. However, there are several advantages to representing semantic similarity within a standardized framework or model. These advantages include the greater consistency and other known mathematical properties that can be imposed by a formal model, and the possible benefits of data reduction and the extraction of meaningful dimensions or features of the stimuli being analyzed. The history of research concerning semantic similarity can be captured in terms of the succession of models that have been applied to this topic. The models themselves can be divided into generic models of similarity and a number of more specialized models aimed at particular aspects of semantic similarity.
1. General Models of Similarity Although the following accounts are meant to address similarity in general terms, they can be readily applied to semantic similarity. The two classical, generic approaches to modeling similarity are spatial models of similarity and Tversky’s (1977) contrast model. More recently, structured approaches have addressed limitations of both of these classical accounts.
1.1 Spatial Representations
Spatial models seek to represent similarity in terms of distance in a psychological space (Shepard 1980). An
item’s position is determined through its coordinate values along the relevant dimensions; nearby points thus represent similar items, whereas distant items are psychologically very different. For example, a spatial representation of mammal categories might represent different animals on dimensions of size and ferocity, with lions and tigers being close on both of these dimensions (Fig. 1). The relevant space is derived using the technique of multidimensional scaling (MDS), a statistical procedure for dimensionality reduction. MDS works from experimental participants’ judgments, typically in matrices of proximity data such as pairwise confusions or similarity ratings between items. Spatial models have been used widely for visualization purposes and as the heart of detailed cognitive models, e.g., of categorization and recognition memory (Nosofsky 1991). Less widely used are related statistical procedures such as hierarchical clustering (Shepard 1980) although they have also been used in the context of semantic similarity.
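A small sketch of the derivation (Python with scikit-learn; the dissimilarity values are invented rather than taken from real proximity data) shows how MDS recovers coordinates from pairwise judgments:

import numpy as np
from sklearn.manifold import MDS

items = ["lion", "tiger", "hedgehog"]
dissimilarity = np.array([
    [0.0, 0.1, 0.9],   # lion vs. lion, tiger, hedgehog
    [0.1, 0.0, 0.8],
    [0.9, 0.8, 0.0],
])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)
for item, (x, y) in zip(items, coords):
    print(item, round(x, 2), round(y, 2))  # lion and tiger land close together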
1.2 The Contrast Model
The traditional alternative to spatial models is Tversky's (1977) contrast model. This account was developed to address perceived limitations of spatial models which bring these into conflict with behavioral data (but also see Nosofsky 1991). Chief among these are violations of the so-called metric axioms that underlie any spatial scheme, such as asymmetries in human similarity judgments that are not well captured by spatial distance which should be symmetrical. For example, people judged the similarity of North Korea to China to be greater than the similarity of China to North Korea. In the contrast model (Eqn. (1)), the similarity between items a and b is a positive function of the features common to both items and a negative function of the distinctive features of a and also the distinctive features of b. Each of these three feature sets is governed by a weighting parameter which allows the model to capture asymmetries according to the nature of a particular task. According to the focusing hypothesis, greater attention is given to distinctive features of the first item in a comparison than of the second item, hence α > β. When China is more familiar than North Korea, having more known distinctive features, then the similarity from China to North Korea should be lower than the similarity of North Korea to China.

S(a, b) = θf(A ∩ B) − αf(A − B) − βf(B − A)   (1)
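Eqn. (1) can be computed directly over explicit feature sets. In the sketch below (Python), the salience function f is taken to be simple set cardinality, and the features and weights are invented; with α > β, the asymmetry in the China and North Korea example follows:

def contrast_similarity(A, B, theta=1.0, alpha=0.8, beta=0.2):
    # Common features raise similarity; distinctive features of either
    # item lower it. alpha > beta implements the focusing hypothesis.
    return (theta * len(A & B)
            - alpha * len(A - B)
            - beta * len(B - A))

china = {"asian", "large", "communist", "ancient culture", "nuclear power"}
north_korea = {"asian", "communist"}

print(contrast_similarity(north_korea, china))  # North Korea to China: 1.4
print(contrast_similarity(china, north_korea))  # China to North Korea: -0.4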
Although applications of the contrast model to the modeling of specific cognitive tasks are fewer than those of spatial models, an application to semantic similarity can be found in Ortony (1979), who applied the model to distinctions between literal and metaphorical similarity.
Figure 1 Multidimensional scaling representation of mammals (animals such as lion, tiger, wolf, dog, elephant, giraffe, pig, and hedgehog plotted as points on dimensions of size and ferocity)
1.3 Structured Representations Despite the efforts aimed at establishing the superiority of either the contrast model or spatial models, it has been argued that both approaches share a fundamental limitation in that they define similarity based on oversimplified kinds of representations: points in space or feature sets. Arguably most theories of the representation of natural objects, visual textures, sentences, etc., assume that these cannot be represented in line with these restrictions (see Feature Representations in Cognitie Psychology). Instead, they seem to require structured representations: complex representations of objects, their parts and properties, and the interrelationships between them. Descriptions such as SIBLING-OF (Linus, Lucy) and SIBLING-OF (Lucy, Linus) cannot be translated in an obvious way to either lists of features or points in space that represent the similarity between these two items as well as their differences from BROTHER-OF (Linus, Lucy). (See also Fodor and Pylyshyn (1988) for a critique of attempts in connectionist networks to represent relational structure in a featural framework.) The perceived need for accounts of similarity to work with structured representations has given rise to the structural alignment account (see Markman (2001) for an overview) which has its roots in research on analogical reasoning (see Mental Models, Psychology of). Structural alignment operates over structured representations such as frames (see Schemas, Frames, and Scripts in Cognitie Psychology) consisting of slots and fillers. The comparison process requires that at least some of the predicates, that is relations such as ABOVE (x, y ), are identical across the comparison.
These identical predicates are placed in correspondence. The alignment process then seeks to build maximal structurally consistent matches between the two representations. Structural alignment has been implemented in a variety of computational models (e.g., Falkenhainer et al. 1990, Goldstone and Medin 1994) that have been used to capture behavioral data, especially similarity judgments and analogical reasoning. Experimental results have supported an important prediction of the structural alignment account, that there will be a greater impact of alignable differences compared with nonalignable differences. Nonalignable differences between two representations are elements of one object that have no correspondence in the other. In contrast, alignable differences refer to representational elements that have corresponding roles in the two representations but fail to match exactly. For example, imagine two tables, one with a flower on top and the other with a bowl on top in addition to a chair beneath it. Comparing the two scenes, flower vs. bowl would be an alignable difference, whereas the chair would be a nonalignable difference.
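A minimal sketch of the first step of this process (Python; the scene descriptions are invented to match the table example above) represents propositions as predicate-argument tuples and places identical predicates in correspondence, which separates alignable from nonalignable differences:

scene1 = {("ON", "flower", "table1")}
scene2 = {("ON", "bowl", "table2"), ("BENEATH", "chair", "table2")}

relations1 = {rel for rel, *_ in scene1}
relations2 = {rel for rel, *_ in scene2}

# Relations present in both scenes can be aligned; their differing
# arguments (flower vs. bowl) are alignable differences. Propositions
# with no counterpart (the chair) are nonalignable differences.
alignable = relations1 & relations2
nonalignable = relations2 - relations1
print(alignable)      # {'ON'}
print(nonalignable)   # {'BENEATH'}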
2. Specialized Models of Semantic Similarity
2.1 Semantic Differentials
Leaving behind generic models of similarity, semantic similarity has also been captured through a variety of specially built models and approaches. The first of these is Osgood et al.'s (1957) semantic differential.
This approach used psychometric techniques to compute the psychological distance between concepts. Participants were required to rate concepts on 10–20 bipolar scales for a set of semantically relevant dimensions, such as positive–negative, feminine–masculine. The ratings of words on these dimensions would essentially form their coordinates in semantic space. The approach is thus related to spatial models of similarity with the difference primarily in the way that the semantic space is derived. Whereas models based on multidimensional scaling require pairwise proximity data for all concepts of interest, the semantic differential approach requires that all concepts of interest are rated on all relevant dimensions. Therefore, the semantic differential method suffers from the difficulty that the relevant dimensions must be stipulated in advance. Still, semantic differentials have been widely used, largely because the data are straightforward to collect, analyze, and interpret.
2.2 Semantic Networks and Featural Models
Osgood et al.'s work motivated the semantic feature model of Smith et al. (1974) which addressed how people verify statements such as 'a robin is a bird' and 'a chicken is a bird,' considering the defining and characteristic features of these concepts. Verifying the latter statement would be slower owing to different characteristic features for chicken vs. bird. The semantic feature model was developed in contrast to Collins and Quillian's (1969) semantic network model. In this model semantic meaning is captured by nodes that correspond to individual concepts. These nodes are connected by a variety of links representing the nature of the relationship between the nodes (such as an IS-A link between robin and bird). Semantic similarity (or distance) is captured in terms of the number of links that must be traversed to reach one concept from another. Although the semantic network model was a groundbreaking attempt at capturing structured relations, it still suffered from some problems. For example, the model correctly predicts that verifying 'a robin is an animal' is slower than verifying 'a robin is a bird' owing to more links being traversed for the first statement. However, this model does not capture typicality effects such as the difference between robin and chicken.
2.3 High-dimensional Context Spaces
Osgood et al.'s semantic differential can also be seen as a predecessor of contemporary corpus-derived measures of semantics and semantic similarity. For example, Burgess and Lund's (1997) hyperspace analogue to language model (HAL) learns a high-dimensional context from large-scale linguistic corpora that encompass many millions of words of speech or text. The model tracks lexical co-occurrences
throughout the corpus and from these derives a high-dimensional representational space. The meaning of a word is conceived of as a vector. Each element of this vector corresponds to another word in the model, with the value of an element representing the number of times that the two words co-occurred within the discourse samples that constitute the corpus. For example, the vector for dog will contain an element reflecting the number of times that the word 'bone' was found within a given range of words in the corpus. These vectors can be viewed as the coordinates of points (individual words) in a high-dimensional semantic space. Semantic similarity is then a matter of distance between points in this space. Several other such usage-based models have been proposed to date; similar in spirit to HAL, for example, is Landauer and Dumais's (1997) latent semantic analysis (see also Semantic Processing: Statistical Approaches). The basic approach might be seen to be taking to its logical consequence Wittgenstein's famous adage of 'meaning as use.' Its prime advantage over related approaches such as spatial models of similarity and the semantic differential lies in the ability to derive semantics and thus measures of semantic similarity for arbitrarily large numbers of words without the need for any especially collected behavioral data. The ability of these models to capture a wide variety of phenomena, such as results in semantic priming and effects of semantic context on syntactic processing, has been impressive.
2.4 Connectionist Approaches
The final approach to semantic similarity to be discussed shares with these context-based models a statistical orientation, but connectionist modeling has been popular particularly in neuropsychological work on language and language processing. In connectionist models, the semantics of words are represented as patterns of activation over banks of units representing individual semantic features. Semantic similarity is then simply the amount of overlap between different patterns, hence these models are related to the spatial accounts of similarity. However, the typically nonlinear activation functions used in these models allow virtually arbitrary re-representations of such basic similarities. The representation schemes utilized in these models tend to be handcrafted rather than derived empirically as in other schemes such as multidimensional scaling and high-dimensional context spaces. However, it is often only very general properties of these semantic representations and the similarities between them that are crucial to a model's behavior, such as whether these representations are 'dense' (i.e., involve the activation of many semantic features) or 'sparse,' so that the actual semantic features chosen are not crucial. For example, this distinction between dense and sparse representation has been used to capture patterns of semantic errors
associated with acquired reading disorders (Plaut and Shallice 1993) and also patterns of category-specific deficits following localized brain damage (Farah and McClelland 1991).
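Common to both the context-space and connectionist schemes is the computation of overlap between vectors. A small sketch (Python; the count vectors are invented) shows the cosine measure often used for this purpose:

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

dog = (4, 0, 2, 3)   # e.g., co-occurrence counts with 'bone', 'wing', ...
cat = (3, 0, 2, 4)
bird = (0, 5, 1, 0)

print(cosine(dog, cat))   # high: largely overlapping patterns
print(cosine(dog, bird))  # low: little overlap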
3. Conclusion
There is a wide variety of models of semantic similarity available. Underlying this array of approaches is a fundamental tension that is as yet unresolved. Many of the models reviewed can be classed as loosely spatial in nature. These models have all been applied extensively to behavioral data, yet there are fundamental limitations to spatial approaches with respect to their representational capacity, made clear by the successes of the contrast model and the structural alignment approach. The behavioral evidence that relational structure is important to similarity seems particularly compelling. This leaves this area of research with two contrasting and seemingly incompatible strands of models, each of which successfully relates to experimental data. It is possible only to speculate on how this contradiction might ultimately be resolved. One possibility is that spatial models and structured representations capture different aspects or types of semantic similarity: one automatic, precompiled, effortless, and in some ways shallow, and one the result of more in-depth, on-line, or metacognitive processing. Such a distinction between types of processing of similarity would be compatible with the success of models based on, for example, context spaces in capturing phenomena such as semantic priming, and the success of models such as those based on structural alignment in capturing phenomena such as analogy, or the understanding of novel concept combinations. Whether such a distinction will take shape or whether there will one day be a single, all-encompassing account remains for future research. See also: Categorization and Similarity Models; Categorization and Similarity Models: Neuroscience Applications; Dementia, Semantic; Lexical Semantics; Semantic Knowledge: Neural Basis of; Semantic Processing: Statistical Approaches; Semantics
Bibliography
Burgess C, Lund K 1997 Modeling parsing constraints with high-dimensional context space. Language and Cognitive Processes 12: 177–210
Collins A M, Quillian M R 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
Falkenhainer B, Forbus K D, Gentner D 1990 The structure-mapping engine: Algorithm and examples. Artificial Intelligence 41: 1–63
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General 120: 339–57
Fodor J, Pylyshyn Z 1988 Connectionism and cognitive architecture: A critical analysis. Cognition 28: 3–71
Goldstone R L, Medin D L 1994 The time course of comparison. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 29–50
Heit E 2001 Properties of inductive reasoning. Psychonomic Bulletin and Review 7: 569–92
Landauer T K, Dumais S T 1997 A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review 104: 211–40
Markman A B 2001 Structural alignment, similarity, and the internal structure of category representations. In: Hahn U, Ramscar M (eds.) Similarity and Categorization. Oxford University Press, Oxford, UK
Nosofsky R M 1991 Stimulus bias, asymmetric similarity and classification. Cognitive Psychology 23: 94–140
Ortony A 1979 Beyond literal similarity. Psychological Review 86: 161–80
Osgood C E, Suci G J, Tannenbaum P H 1957 The Measurement of Meaning. University of Illinois Press, Urbana, IL
Plaut D C, Shallice T 1993 Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology 10: 377–500
Shepard R N 1980 Multidimensional scaling, tree-fitting, and clustering. Science 210: 390–7
Smith E E, Shoben E J, Rips L J 1974 Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review 81: 214–41
Tversky A 1977 Features of similarity. Psychological Review 84: 327–52
U. Hahn and E. Heit
Semantics
Semantics is the study of meaning communicated through language, and is usually taken to be one of the three main branches of linguistics, along with phonology, the study of sound systems, and grammar, which includes the study of word structure (morphology) and of sentence structure (syntax). This entry surveys some of the main topics of current semantics research.
1. Introduction
Traditionally, the main focus of linguistic semantics has been on word meaning, or lexical semantics. Since classical times writers have commented on the fact, noticed surely by most reflecting individuals, that the meaning of words changes over time. Such observations
are the seeds of etymology, the study of the history of words. Over longer stretches of time, such changes become very obvious, especially in literate societies. Words seem to shift around: some narrow in meaning such as English 'queen,' which earlier meant 'woman, wife' but now means 'wife of a king.' Others become more general, while still others shift to take on new meaning or disappear altogether. Words are borrowed from language to language. The study of such processes is now part of historical semantics (Fisiak 1985). Another motivation for the study of word meaning comes from dictionary writers as they try to establish meaning correspondences between words in different languages, or, in monolingual dictionaries, seek to provide definitions for all the words of a language in terms of a simple core vocabulary. In lexicology similarities and differences in word meaning are a central concern. The principled study of the meaning of phrases and sentences has only become established in linguistics relatively recently. Thus it is still common for descriptive grammars of individual languages to contain no separate section on semantics other than providing a lexicon. Nonetheless it has always been clear that one can identify semantic relations between sentences. Speakers of English know from the semantics of negation that nominal negation has different effects than sentence negation, so that 'No-one complained' may aptly be used to answer 'Who complained?,' while 'Someone did not complain' may not. Two sentences may seem to say essentially the same thing, even be paraphrases of each other, yet one may be more suited to one context than another, like the pair 'Bandits looted the train' and 'The train was looted by bandits.' A single sentence may be internally inconsistent, such as 'Today is now tomorrow,' or seem to be repetitive or redundant in meaning, such as 'A capital city is a capital city.' Another feature of sentence meaning is the regularity with which listeners draw inferences from sentences, and often take these to be part of the meaning of what was said. Some inferential links are very strong, such as entailment. Thus we say that 'Bob drank all of the beer' entails 'Bob drank some of the beer' (assuming the same individual Bob, beer, etc.), because it is hard to think of a situation where acceptance of the second sentence would not follow automatically from acceptance of the first. Other inferential links are weaker and more contextually dependent: from the utterance 'Bob drank some of the beer' it might be reasonable to infer 'Bob didn't drink all of the beer,' but it is possible to think of situations where this inference would not hold. We might say that a speaker of the first sentence is implying the second, in a certain context. Speakers of all languages regularly predict and use such inferential behavior to convey their meaning, such that often more meaning seems to be communicated than is explicitly stated. All these aspects of sentence meaning are under study in various semantic frameworks.
Semanticists share with philosophers an interest in key issues in the use of language, notably in reference. We use this term to describe the way in which speakers can pick out, or name, entities in the world by using words as symbols. Many scholars, especially formal semanticists, accept Frege’s distinction between reference (in German, Bedeutung) and sense (Sinn); see Frege (1980). Reference is the act of identifying an entity (the referent) while sense is the means of doing so. Two different linguistic expressions such as ‘the number after nine’ and ‘the number before eleven’ differ in sense but they both share the same referent, ‘ten.’ For semanticists it is particularly interesting to study the various mechanisms that a language offers to speakers for this act of referring. These include names such as ‘Dublin,’ nouns such as ‘cat,’ which can be used to refer to a single individual, ‘your cat,’ or a whole class, ‘Cats are carnivorous,’ quantified nominals such as ‘many cats,’ ‘some cats,’ ‘a few cats,’ etc. Linguists as well as philosophers have to account for language’s ability to allow us to refer to nonexistent and hypothetical referents such as ‘World War Three,’ ‘the still undiscovered cure for cancer,’ ‘the end of the world.’ Semanticists also share interests with psychologists, for if sense is the meaning of an expression, it seems natural to many semanticists to equate it with a conceptual representation. Cognitive semanticists, in particular, (for example Lakoff 1987, Talmy 2000), but also some generative linguists (Jackendoff 1996), seek to explore the relationship between semantic structure and conceptual structure. One axis of the debate is whether words, for example, are simply labels for concepts, or whether there is a need for an independent semantic interface that isolates just grammatically relevant elements of conceptual structure. As Jackendoff (1996) points out, many languages make grammatical distinctions corresponding to the conceptual distinctions of gender and number, but few involve distinctions of colour or between different animal species. If certain aspects of concepts are more relevant to grammatical rules, as is also claimed by Pinker (1989), this may be justification for a semantic interface.
2. Approaches to Meaning Even in these brief remarks we have had to touch on the crucial relationship between meaning and context. Language of course typically occurs in acts of communication, and linguists have to cope with the fact that utterances of the same words may communicate different meanings to different individuals in different contexts. One response to this problem is to hypothesize that linguistic units such as words, phrases, and sentences have an element of inherent meaning that does not vary across contexts. This is sometimes called inherent, or simply, sentence meaning. Language
users, for example speakers and listeners, then enrich this sentence meaning with contextual information to create the particular meaning the speaker means to convey at the specific time, which can then be called speaker meaning. One common way of reflecting this view is to divide the study of meaning into semantics, which becomes the study of sentence meaning, and pragmatics, which is then the study of speaker meaning, or how speakers use language in concrete situations. This is an attempt to deal with the tension between the relative predictability of language between fellow speakers and the great variability of individual interpretations in interactive contexts. One consequence of this approach is the view that the words that a speaker utters underdetermine their intended meaning. Semantics as a branch of linguistics is marked by the theoretical fragmentation of the field as a whole. The distinction between formal and functional approaches, for example, is as marked in semantics as elsewhere. This is a large subject to broach here but see Givón (1995) and Newmeyer (1998) for characteristic and somewhat antagonistic views. One important difference is the attitude to the autonomy of levels of analysis. Are semantics and syntax best treated as autonomous areas of study, each with its own characteristic entities and processes? A related question at a more general level is whether linguistic processes can be described independently of general psychological processes or the study of social interaction. Scholars in different theoretical frameworks will give contradictory answers to these questions of micro- and macroautonomy. Autonomy at both levels is characteristic of semantics within generative grammar; see, for example, Chomsky (1995). Functionalists such as Halliday (1996) and Harder (1996) would on the other hand argue against microautonomy, suggesting that grammatical relations and structure cannot be understood without reference to semantic function. They also seek motivation for linguistic structure in the dynamics of communicative interaction. A slightly different external mapping is characteristic of cognitive semantics, for example Lakoff (1987) and Langacker (1987), where semantic structures are correlated to conceptual structures. Another dividing issue in semantics is the value of formal representations. Scholars are divided on whether our knowledge of semantics is sufficiently mature to support attempts at mathematical or other symbolic modeling; indeed, on whether such modeling serves any use in this area. Partee (1996), for example, defends the view of formal semanticists that the application of symbolic logic to natural languages, following in particular the work of Montague (1974), represents a great advance in semantic description. Jackendoff (1990), on the other hand, acknowledges the value of formalism in semantic theory and description but argues that formal logic is too narrow adequately to describe meaning in language. Other
scholars, such as Wierzbicka (1992), view the search for formalism as premature and distracting. There has been an explosive increase in the research in formal semantics since Montague's (1974) proposal that the analysis of formal languages could serve as the basis for the description of natural languages. Montague's original theory comprised a syntax for the natural language, say English, a syntax for the logical language into which English should be translated (intensional logic), rules for the translation, and rules for the semantic interpretation of the intensional logic. This and subsequent formal approaches are typically referential (or denotational) in that their emphasis is on the connection of language with a set of possible worlds, including the real, external world and the hypothetical worlds set up by speakers. Crucial to this correspondence is the notion of truth, defined at the sentence level. A sentence is true if it correctly describes a situation in some world. In this view, the meaning of a sentence is characterized by describing the conditions which must hold for it to be true. The central task for such approaches is to extend the formal language to cope with the semantic features of natural language while maintaining the rigor and precision of the methodology. See the papers in Lappin (1996) for typical research in this paradigm. Research in cognitive semantics presents an alternative strategy. Cognitive semanticists reject what they see as the mathematical, antimentalist approach of formal semantics. In their view meaning is described by relating linguistic expressions to mental entities, conventionalized conceptual structures. These semanticists have proposed a number of conceptual structures and processes, many deriving from perception and bodily experience and, in particular, conceptual models of space. Proposals for underlying conceptual structures include image schemas (Johnson 1987), mental spaces (Fauconnier 1994), and conceptual spaces (Gärdenfors 1999). Another focus of interest is the processes for extending concepts, and here special attention is given to metaphor. Lakoff (1987) and Johnson (1987) have argued against the classical view of metaphor and metonymy as something outside normal language, added as a kind of stylistic ornament. For these writers metaphor is an essential element in our categorization of the world and our thinking processes. Cognitive semanticists have also investigated the conceptual processes which reveal the importance of the speaker's perspective and construal of a scene, including viewpoint shifting, figure-ground shifting, and profiling (Langacker 1987).
3. Topics in Sentence Semantics
Many of the semantic systems of language, for example tense (see Tense, Aspect, and Mood, Linguistics of), aspect, mood, and negation, are marked
grammatically on individual words such as verbs. However, they operate over the whole sentence. This ‘localization’ is the reason that descriptive grammars usually distribute semantic description over their analyses of grammatical forms. Such semantic systems offer the speaker a range of meaning distinctions through which to communicate a message. Theoretical semanticists attempt to characterize each system qua system, as in, for example, Verkuyl’s (1993) work on aspect and Hornstein’s (1990) work on tense. Typological linguists try to characterize the variation in such systems across the world’s languages, as in the studies of tense and aspect by Comrie (1976, 1985), Binnick (1991), and Bybee et al. (1994). We can sketch some basic features of some of these systems.
3.1 Situation Type and Aspect
Situation type and aspect are terms for a language’s resources that allow a speaker to describe the temporal ‘shape’ of events. The term situation type is used to describe the system encoded in the words of a language, while aspect is used for the grammatical systems which perform a similar role. To take one example, languages typically allow speakers to describe a situation either as static, as in ‘The bananas are ripe,’ or as dynamic, as in ‘The bananas are ripening.’ Here the state is the result of the process, but the same situation can be viewed as more static or dynamic, as in ‘The baby is asleep’ and ‘The baby is sleeping.’ As these examples show, this distinction is lexically marked: in English, for example, adjectives are typically used for states, and verbs for dynamic situations. There is, however, a group of stative verbs, such as ‘know,’ ‘understand,’ ‘love,’ ‘hate,’ which describe static situation types.
There are a number of semantic distinctions typically found amongst dynamic verbs, for example the telic/atelic (bounded/unbounded) distinction and the punctual/durative distinction. Telic verbs describe processes which are seen as having a natural completion, which atelic verbs do not. A telic example is ‘Matthew was growing up,’ and an atelic example is ‘Matthew was drinking.’ If these processes are interrupted at any point, we can automatically say ‘Matthew drank,’ but not ‘Matthew grew up.’ However, atelic verbs can form telic phrases and sentences by combining with other grammatical elements, so that ‘Matthew was drinking a pint of beer’ is telic. Durative verbs, as the term suggests, describe processes that last for a period of time, while punctual describes those that seem so instantaneous that they have no detectable internal structure, as in the comparison between ‘The man slept’ and ‘The light flashed.’ As has often been observed, if an English punctual verb is used with a durative adverbial, the result is an iterative meaning, as in ‘The light flashed all night,’ where we understand the event to be repeated over the time mentioned.
Situation type typically interacts with aspect. Aspect is the grammatical system that allows the speaker choices in how to portray the internal temporal nature of a situation. An event, for example, may be viewed as closed and completed, as in ‘Joan wrote a book,’ or as an ongoing process, perhaps unfinished, as in ‘Joan was writing a book.’ The latter verb form is described as being in the progressive aspect in English, and similar distinctions are very common in the languages of the world. Many languages are described as having a distinction between perfective and imperfective aspects, used to describe complete versus incomplete events; see Bybee et al. (1994) for a survey. As mentioned above, aspect is intimately associated with both situation type and tense. In English, for example, stative verbs are typically not used with progressive aspect, so that one may say ‘I know some French’ but not ‘I am knowing some French.’ In Classical Arabic, the perfective is strongly associated with past tense (Comrie 1976, Binnick 1991). Staying with the progressive, when it is used in the present tense in English (and in many other languages) it carries a meaning of proximate future or confident prediction, as in ‘We’re driving to Los Angeles’ or ‘I’m leaving you.’ The combination of the three semantic categories of tense, situation type, and aspect produces a complex system that allows speakers to make subtle distinctions in relating an event or describing a situation.
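These interactions lend themselves to a small computational sketch. The two composition rules encoded below restate observations from this section (the iterative reading of a punctual verb with a durative adverbial, and the resistance of stative verbs to the progressive); the miniature lexicon and the function itself are invented for illustration and are no substitute for a worked-out theory of aspect.

```python
# A sketch of how situation type, aspect, and adverbials compose.
# The feature lexicon and rule inventory are illustrative only.

LEXICON = {
    "flash": {"dynamic", "punctual"},
    "sleep": {"dynamic", "durative"},
    "know":  {"stative"},
}

def interpret(verb, progressive=False, durative_adverbial=False):
    features = LEXICON[verb]
    # Stative verbs typically resist the progressive ("I am knowing French").
    if "stative" in features and progressive:
        return "anomalous: stative verb in progressive aspect"
    # A punctual verb with a durative adverbial yields an iterative reading
    # ("The light flashed all night" = the event repeats over the interval).
    if "punctual" in features and durative_adverbial:
        return "iterative: event repeated over the interval"
    if progressive:
        return "ongoing process, viewed as incomplete"
    return "situation viewed as a closed whole"

print(interpret("flash", durative_adverbial=True))
print(interpret("know", progressive=True))
print(interpret("sleep", progressive=True))
```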
3.2 Modality
Modality is a semantic system that allows speakers to express varying attitudes to a proposition. Semanticists have traditionally identified two types of modality. One is termed epistemic modality, which encodes a speaker’s commitment to, or belief in, a proposition, from the certainty of ‘The ozone layer is shrinking’ to the weaker commitments of ‘The ozone layer may/might/could be shrinking.’ The second is deontic modality, where the speaker signals a judgment toward social factors of obligation, responsibility, and permission, as in the various interpretations of ‘You must/can/may/ought to borrow this book.’ These examples show that similar markers, here auxiliary verbs, can be used for both types. When modality distinctions are marked by particular verbal forms, these are traditionally called moods. Thus many languages, including Classical Greek and Somali, have a verb form labeled the optative mood for expressing wishes and desires. Other markers of modality in English include verbs of propositional attitude, as in ‘I know/believe/think/doubt that the ozone layer is shrinking,’ and modal adjectives, as in ‘It is certain/probable/likely/possible that the ozone layer is shrinking.’
A related semantic system is evidentiality, where a speaker communicates the basis or source for
presenting a proposition. In English and many other languages this may be done by adding expressions like ‘allegedly,’ ‘so I’ve heard,’ ‘they say,’ etc., but certain languages mark such differences morphologically, as in Makah, a Nootkan language spoken in Washington State (Jacobsen 1986, p. 10):
wiki:caxaw: ‘It’s bad weather’ (seen or experienced directly);
wiki:caxakpi:d: ‘It looks like bad weather’ (inference from physical evidence);
wiki:caxakqad\I: ‘It sounds like bad weather’ (on the evidence of hearing); and
wiki:caxakwa:d: ‘I’m told there’s bad weather’ (quoting someone else).
3.3 Semantic Roles
This term describes the speaker’s semantic repertoire for relating participants in a described event. One influential proposal in the semantics literature is that each language contains a set of semantic roles, the choice of which is partly determined by the lexical semantics of the verb selected by the speaker. A characteristic list of such roles is:
agent: the initiator of some action, capable of acting with volition;
patient: the entity undergoing the effect of some action, often undergoing some change in state;
theme: the entity which is moved by an action, or whose location is described;
experiencer: the entity which is aware of the action or state described by the predicate but which is not in control of the action or state;
beneficiary: the entity for whose benefit the action was performed;
instrument: the means by which an action is performed or something comes about;
location: the place in which something is situated or takes place;
goal: the entity towards which something moves;
recipient: the entity which receives something; and
source: the entity from which something moves.
In an example like ‘Harry immobilized the tank with a broomstick,’ the entity Harry is described as the agent, the tank as the patient, and the broomstick as the instrument. These roles have also variously been called deep semantic cases, thematic relations, participant roles, and thematic roles.
One concern is to explain the matching between semantic roles and grammatical relations. In many languages, as in the last example, there is a tendency for the subject of the sentence to correspond to the agent and for the direct object to correspond to a patient or theme; an instrument often occurs as a prepositional phrase. Certain verbs allow variations from this basic mapping, for example the variations we find with English verbs such as ‘break’: ‘The boy broke the window with a stone’ (subject = agent); ‘The stone broke the window’ (subject = instrument); ‘The window
broke’ (subject = patient). Clearly verbs can be arranged into classes depending on the variations of mappings they allow, and not all English verbs pattern like ‘break.’ We can say ‘The admiral watched the battle with a telescope,’ but ‘The telescope watched the battle’ and ‘The battle watched’ sound decidedly odd.
From this literature emerges the claim that certain mappings are more natural or universal. One proposal is that, for example, there is an implicational hierarchy governing the mapping to subject, typically such as: agent > recipient/benefactive > theme/patient > instrument > location. In such a hierarchy each element is preferred over its right neighbor, so that moving rightward along the string gives us less expected subjects. The hierarchy also makes certain typological claims: if a language allows a certain semantic role to be subject, it will allow all those to its left. Thus if we find that a language allows the role instrument to be subject, we predict that it allows the roles to the left, but we do not know if it allows location subjects.
One further application of semantic roles is in lexical semantics, where the notion allows verbs to be classified by their semantic argument structure. Verbs are assigned semantic role templates or grids by which they may be sorted into natural classes. Thus, English has a class of transfer, or giving, verbs, which in one type includes the verbs ‘give,’ ‘lend,’ ‘supply,’ ‘pay,’ ‘donate,’ ‘contribute.’ These verbs encode a view of the transfer from the perspective of the agent and may be assigned the pattern agent, theme, recipient, as in ‘The committee donated aid to the famine victims.’ A second subclass of these transfer verbs encodes the process from the perspective of the recipient. These verbs include ‘receive,’ ‘accept,’ ‘borrow,’ ‘buy,’ ‘purchase,’ ‘rent,’ ‘hire,’ and have the pattern recipient, theme, source, as in ‘The victims received aid from the committee.’
3.4 Entailment, Presupposition, and Implication
These terms relate to types of information a hearer gains from an utterance but which are not stated directly by the speaker. These phenomena have received a lot of attention because they seem to straddle the putative divide between semantics and pragmatics described above, and because they reveal the dynamic and interactive nature of understanding the meaning of utterances.
Entailment describes a relationship between sentences such that on the basis of one sentence, a hearer will accept a second, unstated sentence purely on the basis of the meaning of the first. Thus sentence A entails sentence B if it is not possible to accept A but reject B. In this view a sentence such as ‘I bought a dog today’ entails ‘I bought an animal today’; or ‘President Kennedy was assassinated yesterday’ entails ‘President Kennedy is now dead.’ Clearly these sentential relations depend on lexical relations: a speaker who understands the meaning of
the English word ‘dog’ knows that a dog is an animal; similarly the verb ‘assassinate’ necessarily involves the death of the unfortunate object argument. Entailment then is seen as a purely automatic process, involving no reasoning or deduction, but following from the hearer’s linguistic knowledge. Entailment is amenable to characterization by truth conditions. A sentence is said to entail another if the truth of the first guarantees the truth of the second, and the falsity of the second guarantees the falsity of the first.
Presupposition, on the other hand, is a more complicated notion. In basic terms, the idea is simple enough: a speaker communicates certain assumptions aside from the main message. A range of linguistic elements communicates these assumptions. Some, such as names, and definiteness markers such as the articles ‘the’ and ‘my,’ presuppose the existence of entities. Thus ‘James Brown is in town’ presupposes the existence of a person so called. Other elements have more specific presuppositions. A verb such as ‘stop’ presupposes a preexisting situation. So a sentence ‘Christopher has stopped smoking’ presupposes ‘Christopher smoked.’ If treated as a truth-conditional relation, presupposition is distinguished from entailment by the fact that it survives under negation: ‘Christopher has not stopped smoking’ still presupposes ‘Christopher smoked,’ but the sentence ‘I didn’t buy a dog today’ does not entail ‘I bought an animal today.’ There are a number of other differences between entailment and presupposition that cast doubt on the adequacy of a purely semantic, truth-conditional account of the latter. Presuppositions are notoriously context sensitive, for example. They may be cancelled without causing an anomaly: a hearer can reply ‘Christopher hasn’t stopped smoking, because he never smoked’ to cancel the presupposition by what is sometimes called metalinguistic negation. This dependency on context has led some writers to propose that presupposition is a pragmatic notion, definable in terms of the set of background assumptions that the speaker assumes is shared in the conversation. See Beaver (1997) for discussion.
A third type of inference is Grice’s conversational implicature (1975, 1978). This is an extremely context-sensitive type of inference which allows participants in a conversation to maintain coherence. So, given the invented exchange below,
A: Did you give Mary the book?
B: I haven’t seen her yet.
it is reasonable for A to infer the answer ‘no’ to her question. Grice proposed that such inferences are routinely relied on by both speakers and hearers, and that this reliance is based on certain assumptions that hearers make about a speaker’s conduct. Grice classified these into several different types, giving rise to different types of inference, or, from the speaker’s point of view, what he termed implicatures. The four main maxims are called Quality, Quantity,
Relevance, and Manner (Grice 1975, 1978). They amount to a claim that a listener will assume, unless there is evidence to the contrary, that a speaker will have calculated their utterance along a number of parameters: they will tell the truth, try to estimate what their audience knows, and package their material accordingly, have some idea of the current topic, and give some thought to their audience being able to understand them. In our example above, it is A’s assumption that B’s reply is intended to be relevant that allows the inference ‘no.’
Implicature has three characteristics: first, it is implied rather than said; second, its existence is a result of the context, i.e., the specific interaction. There is no guarantee that in other contexts ‘I haven’t seen her’ will be used to communicate ‘no.’ Third, implicature is cancelable without causing a contradiction. Thus the implicature ‘no’ in our example can be cancelled if B adds the clause ‘but I mailed it to her last week.’
These three notions—entailment, presupposition, and implicature—can all be seen as types of inference. They are all produced in conversation, and are taken by participants to be part of the meaning of what a speaker has said. They differ in a number of features and crucially in context sensitivity. The attempt to provide a unified analysis of them all is a challenge to semantic and pragmatic theories. See Sperber and Wilson (1995) for an attempt at such a unified approach.
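This section’s diagnostics can be contrasted in a small sketch: entailments fail under negation, presuppositions survive it, and implicatures can be cancelled without contradiction. The representation of a sentence as a record of its inferences is an invented convenience for illustration, not a semantic theory.

```python
# Contrasting entailment, presupposition, and implicature.
# The record format below is illustrative only.

sentence = {
    "text": "Christopher has stopped smoking",
    "entails": {"Christopher does not smoke now"},
    "presupposes": {"Christopher smoked"},
}

def inferences(s, negated=False):
    """Presuppositions project through negation; entailments do not."""
    inferred = set(s["presupposes"])  # survives "has NOT stopped smoking"
    if not negated:
        inferred |= s["entails"]      # holds only of the unnegated sentence
    return inferred

print(inferences(sentence))                 # both kinds of inference
print(inferences(sentence, negated=True))   # only the presupposition

# An implicature, by contrast, is context-bound and cancelable:
reply = "I haven't seen her yet"            # implicates 'no' via Relevance
cancelled = reply + ", but I mailed it to her last week"
# The continuation withdraws the implicature without any contradiction.
```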
4. Future Developments
Although semantics remains theoretically a very diverse field it is possible to detect some shared trends which seem likely to develop further. One is a move away from a static view of sentences in isolation, detached from the speaker/writer’s act of communication, toward dynamic, discourse-based approaches. This has always been characteristic of functional approaches to meaning but has also been noticeable in formal approaches as they move away from their more philosophical origins. Among examples of this we might mention discourse representation theory (Kamp and Reyle 1993) and dynamic semantics (Groenendijk et al. 1996). Another development which seems likely to continue is a closer integration with other disciplines in cognitive science. In particular, computational techniques seem certain to make further impact on a range of semantic inquiry, from lexicography to the modeling of questions and other forms of dialogue. A subfield of computational semantics has emerged and will continue to develop; see Rosner and Johnson (1992) for example.
See also: Etymology; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexical
Semantics; Lexicology and Lexicography; Lexicon; Semantic Knowledge: Neural Basis of; Semantic Processing: Statistical Approaches; Semantic Similarity, Cognitive Psychology of; Word Meaning: Psychological Aspects
Bibliography
Beaver D 1997 Presuppositions. In: van Benthem J, ter Meulen A (eds.). Handbook of Logic and Language. Elsevier, Amsterdam, pp. 939–1008
Binnick R I 1991 Time and the Verb: A Guide to Tense and Aspect. Oxford University Press, Oxford, UK
Bybee J, Perkins R, Pagliuca W 1994 The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. University of Chicago Press, Chicago
Chomsky N 1995 The Minimalist Program. MIT Press, Cambridge, MA
Comrie B 1976 Aspect: An Introduction to the Study of Verbal Aspect and Related Problems. Cambridge University Press, Cambridge, UK
Comrie B 1985 Tense. Cambridge University Press, Cambridge, UK
Fauconnier G 1994 Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge University Press, Cambridge, UK
Fisiak J (ed.) 1985 Historical Semantics—Historical Word Formation. Mouton de Gruyter, Berlin
Frege G 1980 Translations from the Philosophical Writings of Gottlob Frege [ed. Geach P, Black M]. Blackwell, Oxford, UK
Gärdenfors P 1999 Some tenets of cognitive semantics. In: Allwood J, Gärdenfors P (eds.). Cognitive Semantics: Meaning and Cognition. John Benjamins, Amsterdam, pp. 12–36
Givón T 1995 Functionalism and Grammar. John Benjamins, Amsterdam
Grice H P 1975 Logic and conversation. In: Cole P, Morgan J (eds.). Syntax and Semantics, Vol. 3: Speech Acts. Academic Press, New York, pp. 43–58
Grice H P 1978 Further notes on logic and conversation. In: Cole P (ed.). Syntax and Semantics 9: Pragmatics. Academic Press, New York, pp. 113–28
Groenendijk J, Stokhof M, Veltman F 1996 Coreference and modality. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 179–214
Halliday M A K 1994 An Introduction to Functional Grammar, 2nd edn. Edward Arnold, London
Harder P 1996 Functional Semantics: A Theory of Meaning, Structure and Tense in English. Mouton de Gruyter, Berlin
Hornstein N 1990 As Time Goes By: Tense and Universal Grammar. MIT Press, Cambridge, MA
Jackendoff R 1990 Semantic Structures. MIT Press, Cambridge, MA
Jackendoff R 1996 Semantics and cognition. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 539–60
Jacobsen W H Jr 1986 The heterogeneity of evidentials in Makah. In: Chafe W, Nichols J (eds.). Evidentiality: The Linguistic Coding of Epistemology. Ablex, Norwood, NJ, pp. 3–28
Johnson M 1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. University of Chicago Press, Chicago
Kamp H, Reyle U 1993 From Discourse to Logic. Kluwer, Dordrecht, The Netherlands
Lakoff G 1987 Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press, Chicago
Langacker R W 1987 Foundations of Cognitive Grammar. Stanford University Press, Stanford, CA
Lappin S (ed.) 1996 The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK
Lehmann W P 1992 Historical Linguistics, 3rd edn. Routledge, London
Montague R 1974 Formal Philosophy: Selected Papers of Richard Montague [ed. Thomason R H]. Yale University Press, New Haven, CT
Newmeyer F J 1998 Language Form and Language Function. MIT Press, Cambridge, MA
Partee B H 1996 The development of formal semantics in linguistic theory. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 11–38
Pinker S 1989 Learnability and Cognition: The Acquisition of Argument Structure. MIT Press, Cambridge, MA
Rosner M, Johnson R (eds.) 1992 Computational Linguistics and Formal Semantics. Cambridge University Press, Cambridge, UK
Sperber D, Wilson D 1995 Relevance: Communication and Cognition, 2nd edn. Blackwell, Oxford, UK
Talmy L 2000 Toward a Cognitive Semantics. MIT Press, Cambridge, MA
Verkuyl H J 1993 A Theory of Aspectuality: The Interaction Between Temporal and Atemporal Structure. Cambridge University Press, Cambridge, UK
Wierzbicka A 1992 Semantics, Culture, and Cognition: Universal Concepts in Culture-specific Configurations. Oxford University Press, Oxford, UK
J. I. Saeed
Semiotics Semiotics is an interdisciplinary field that studies ‘the life of signs within society’ (Saussure 1959, p. 16). While ‘signs’ most commonly refers to the elements of verbal language and other vehicles of communication, it also denotes any means of representing or knowing about an aspect of reality. As a result, semiotics has developed as a close cousin of such traditional disciplines as philosophy and psychology. In the social sciences and humanities, semiotics has become an influential approach to research on culture and communication particularly since the 1960s. This article describes the classical origins of semiotics, exemplifies its application to contemporary culture, and outlines its implications for the theory of science.
1. Origins: Logic and Linguistics
Two different senses of ‘signs’ can be traced in the works of Aristotle (Clarke 1990, p. 15) (see Aristotle (384–322 BC)). First, the mental impressions that
people have are signs which represent certain objects in the world to them. Second, spoken and written expressions are signs with which people are able to represent and communicate a particular understanding of these objects to others.
The first sense, of mental impressions, points to the classical understanding of signs, not as words or images, but as naturally occurring evidence that something is the case. For example, fever is a sign of illness, and clouds are a sign that it may rain, to the extent that these signs are interpreted by humans. The second sense of signs as means of communication points towards the distinction that came to inform most modern science between conventional signs, especially verbal language, and sense data, or natural signs. Modern natural scientists could be said to study natural signs with reference to specialized conventional signs. Given the ambition of semiotics to address the fundamental conditions of human knowledge, it has been an important undercurrent in the history of science and ideas.
The first explicit statement regarding natural signs and language as two varieties of one general category of signs came not from Aristotle, but from St. Augustine (c. AD 400). This position was elaborated throughout the medieval period with reference, in part, to the understanding of nature as ‘God’s Book’ and an analogy to the Bible. However, it was not until the seventeenth century that the term semiotic emerged in John Locke’s An Essay concerning Human Understanding (1690). Here, Locke proposed a science of signs in general, only to restrict his own focus to ‘the most usual’ signs, namely, verbal language and logic (Clarke 1990, p. 40).
1.1 Peircean Logic and Semiotics Charles Sanders Peirce (1839–1914) was the first thinker to recover this undercurrent in an attempt to develop a general semiotic. He understood his theory of signs as a form of logic, which informed a comprehensive system for understanding the nature of being and of knowledge. The key to the system is Peirce’s definition of the sign as having three aspects: A sign, or representamen, is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign. That sign stands for something, its object.
An important implication of this definition is that signs always serve to mediate between objects in the world, including social facts, and concepts in the mind. Contrary to various skepticist positions, from Antiquity to postmodernism, Peirce’s point was not that signs are what we know, but how we come to know what we can justify saying that we know, in science.
Figure 1 The process of semiosis
and in everyday life. Peircean semiotics married a classical, Aristotelian notion of realism with the modern, Kantian position (see Kant, Immanuel (1724–1804)) that humans necessarily construct their understanding of reality in the form of particular cognitive categories. A further implication is that human understanding is not a singular internalization of reality, but a continuous process of interpretation, what is called semiosis. This is witnessed, for example, in the process of scientific discovery, but also in the ongoing coordination of social life. Figure 1 illustrates the process of semiosis, noting how any given interpretation (interpretant) itself serves as a sign in the next stage of an unending process which differentiates the understanding of objects in reality. Although Peirce’s outlook was that of a logician and a natural scientist, semiosis can be taken to refer to the processes of communication by which cultures are maintained and societies reproduced and, to a degree, reformed. One of the most influential elements of Peirce’s semiotics outside logic and philosophy has been his categorization of different types of signs, particularly icon, index, and symbol. Icons relate to their objects through resemblance (e.g., a realistic painting); indices have a causal relation (e.g., fever as a symptom of illness); and symbols have an arbitrary relation to their object (e.g., words). Different disciplines and fields have relied on these types as analytical instruments in
order to describe the ways in which humans perceive and act upon reality.
1.2 Saussurean Linguistics and Semiology
Compared with Peirce, the other main figure in the development of semiotics, Ferdinand de Saussure (1857–1913), placed a specific focus on verbal language (see Saussure, Ferdinand de (1857–1913)), what Peirce had referred to as symbols. Probably the main achievement of Saussure was to outline the framework for modern linguistics (see Linguistics: Overview). In contrast to the emphasis that earlier philology had placed on the diachronic perspective of how languages change over time, Saussure proposed to study language as a system in a synchronic perspective. Language as an abstract system (langue) could be distinguished, at least for analytical purposes, from the actual uses of language (parole). The language system has two dimensions. Along the syntagmatic dimension, letters, words, phrases, etc., are the units that combine to make up meaningful wholes, and each of these units has been chosen as one of several possibilities along a paradigmatic dimension, for example, one verb in preference to another. This combinatory system helps to account for the remarkable flexibility of language as a medium of social interaction.
To be precise, Saussure referred to the broader science of signs, as cited in the introduction to this article, not as semiotics, but as semiology. Like Locke, and later Peirce, Saussure relied on classical Greek to coin a technical term, and the two variants are explained, in part, by the fact that Peirce and Saussure had no knowledge of each other’s work. Moreover, in Saussure’s own work, the program for a semiology remained undeveloped, appearing almost as an aside from his main enterprise of linguistics. It was during the consolidation of semiotics as an interdisciplinary field from the 1960s that this became the agreed term, overriding Peirce’s ‘semiotic’ and Saussure’s ‘semiology’, as symbolized by the formation in 1969 of the International Association for Semiotic Studies. It was also during this period that Saussure’s systemic approach was redeveloped to apply to culture and society.
One important legacy of Saussure has been his account of the arbitrariness of the linguistic sign. This sign type is said to have two sides, a signified (concept) and a signifier (the acoustic image associated with it), whose relation is conventional or arbitrary. While this argument is sometimes construed in skepticist and relativist terms, as if sign users were paradoxically free to choose their own meanings, and hence almost destined to remain divorced from any consensual reality, the point is rather that the linguistic system as a whole, including the interrelations between signs and
their components, is arbitrary, but fixed by social convention. When applied to studies of cultural forms other than language, or when extended to the description of social structures as signs, the principle of arbitrariness has been a source of analytical difficulties.
2. Applications: Media, Communication and Culture
It was the application of semiotics to questions of meaning beyond logic and linguistics which served to consolidate semiotics as a recognizable, if still heterogeneous, field from the 1960s. Drawing on other traditions of aesthetic and social research, this emerging field began to make concrete what the study of ‘the life of signs within society’ might mean. The objects of analysis ranged from artworks to mass media to the life forms of premodern societies, but studies were united by a common interest in culture in the broad sense of worldviews that orient social action. The work of Claude Lévi-Strauss (1963) on structural anthropology was characteristic of, and highly influential on, such research on shared, underlying systems of interpretation which might explain the viewpoints or actions of individuals in a given context. It was Saussure’s emphasis on signs as a system, then, which provided a model for structuralist theories (see Structuralism) about social power (e.g., Louis Althusser) and about the unconscious as a ‘language’ (e.g., Jacques Lacan) (see Psychoanalysis: Overview).
Of the French scholars who were especially instrumental in this consolidation, Roland Barthes, along with A. J. Greimas, stands out as an innovative and systematic theorist. His model of two levels of signification, reproduced in Fig. 2, has been one of the most widely copied attempts to link concrete sign vehicles, such as texts and images, with the ‘myths’ or ideologies which they articulate. Building on Louis Hjelmslev’s formal linguistics, Barthes (1973) suggested that the combined signifier and signified (expressive form and conceptual content) of one sign (e.g., a picture of a black man in a French uniform saluting the flag) may become the expressive form of a
further, ideological content (e.g., that French imperialism is not a discriminatory or oppressive system).
Figure 2 Two Levels of Signification: Language (1. Signifier; 2. Signified; 3. Sign) and MYTH (I SIGNIFIER; II SIGNIFIED; III SIGN) (reproduced by permission of Roland Barthes [Random House Group Ltd (Jonathan Cape)] from Mythologies, p. 115, first published in French by de Seuil in 1957)
Barthes’ political agenda, shared by much semiotic scholarship since then, was that this semiotic mechanism serves to naturalize particular worldviews, while oppressing others, and should be deconstructed (see Critical Theory: Contemporary).
In retrospect, one can distinguish two ways of appropriating semiotics in social and cultural research. On the one hand, semiotics may be treated as a methodology for examining signs, whose social or philosophical implications are interpreted with reference to another set of theoretical concepts. On the other hand, semiotics may also supply the theoretical framework, so that societies, psyches, and images are understood not just in procedural, but also in actual conceptual terms as signs. The latter position is compatible with a variant of semiotics which assumes that biological and even cosmological processes are also best understood as semioses. This ambitious and arguably imperialistic extension of logical and linguistic concepts into other fields has encountered criticism, for example, in cases where Saussure’s principle of arbitrariness has been taken to apply to visual communication or to the political and economic organization of society. A semioticization of, for instance, audiovisual media (for example, see Visual Images in the Media) can be said to neglect their appeal to radically different perceptual registers, which are, in certain respects, natural rather than conventional (e.g., Messaris 1994) (for example, see Cognitive Science: Overview). Similarly, a de facto replacement of social science by semiotics may exaggerate the extent to which signs, rather than material and institutional conditions, determine the course of society, perhaps in response to semioticians’ political ambition of making a social difference by deconstructing signs. In recent decades, these concerns, along with critiques of Saussurean formalist systems, have led to attempts to formulate a social semiotics, integrating semiotic methodology with other social and communication theory (e.g., Hodge and Kress 1988, Jensen 1995). On the relations between semiotics and other fields, see Nöth 1990.
A final distinction within semiotic studies of society and culture arises from the question of what is a ‘medium’. Media research has drawn substantially on semiotics to account for the distinctive sign types, codes, narratives and modes of address of different media such as newspapers, television or the Internet (for example, see Mass Media: Introduction and Schools of Thought). A particular challenge has been the film medium, which was described by Christian Metz (1974) not as a language system, but as a ‘language’ drawing on several semiotic codes. Beyond this approach to media as commonly understood, semioticians, taking their inspiration from Lévi-Strauss’ anthropology, have described other objects and artifacts as vehicles of meaning. One rich example
is Barthes’ study of the structure of fashion (see Fashion, Sociology of), as depicted in magazines (Barthes 1985). The double meaning of a ‘medium’ is another indication both of the potential theoretical richness of semiotics and of the pitfalls of confusing different levels of analysis.
3. Theory of Signs and Theory of Science It is this double edge which most likely explains the continuing influence of semiotic ideas since Antiquity, mostly, however, as an undercurrent of science. The understanding articulated in Aristotle’s writings of signs as evidence for an interpreter of something else which is at least temporarily absent, or as an interface which is not identical with the thing that is in evidence to the interpreter, may be taken as a cultural leap that has made possible scientific reflexivity as well as social organization beyond the here and now. Further, the sign concept has promoted the crucial distinction beginning with Aristotle between necessary and probable relations, as in logical inferences. Returning to the examples above, fever is a sure sign that the person is ill, since illness is a necessary condition of fever. But clouds are only a probable sign of rain. Empirical social research is mostly built on the premise that probable signs can lead to warranted inferences. The link between a theory of signs and a theory of science has been explored primarily from a Peircean perspective. For one thing, Peirce himself made important contributions to logic and to the theory of science generally, notably by adding to deduction and induction the form of inference called abduction, which he detected both in everyday reasoning and at the core of scientific innovation. For another thing, the wider philosophical tradition of pragmatism (for example, see Pragmatism: Philosophical Aspects), which Peirce inaugurated, has emphasized the interrelations between knowledge, education, and action, and it has recently enjoyed a renaissance at the juncture between theory of science and social theory (e.g., Bernstein 1991, Joas 1993). By contrast, the Saussurean tradition has often bracketed issues concerning the relationship between the word and the world (Jakobson 1981, p. 19), even if critical research in this vein has recruited theories from other schools of thought in order to move semiology beyond the internal analysis of signs. For future research, semiotics is likely to remain a source of inspiration in various scientific domains, as it has been for almost 2500 years. In light of the record from the past century of intensified interest in theories of signs, however, the field is less likely to develop into a coherent discipline. One of the main opportunities for semiotics may be to develop a meta-framework for understanding how different disciplines and fields conceive of their ‘data’ and ‘concepts’. The semiotic heritage offers both systematic analytical procedures
and a means of reflexivity regarding the role of signs in society as well as in the social sciences.
See also: Communication: Philosophical Aspects
Bibliography
Barthes R 1973 Mythologies. Paladin, London
Barthes R 1985 The Fashion System. Cape, London
Bernstein R J 1991 The New Constellation. Polity Press, Cambridge, UK
Bouissac P (ed.) 1998 Encyclopedia of Semiotics. Oxford University Press, New York
Clarke Jr. D S 1990 Sources of Semiotic. Southern Illinois University Press, Carbondale, IL
Greimas A J, Courtés J 1982 Semiotics and Language: An Analytical Dictionary. Indiana University Press, Bloomington, IN
Hodge R, Kress G 1988 Social Semiotics. Polity Press, Cambridge, UK
Jakobson R 1981 Linguistics and poetics. In: Selected Writings. Mouton, The Hague, The Netherlands, Vol. 3
Joas H 1993 Pragmatism and Social Theory. University of Chicago Press, Chicago, IL
Jensen K B 1995 The Social Semiotics of Mass Communication. Sage, London
Lévi-Strauss C 1963 Structural Anthropology. Basic Books, New York
Messaris P 1994 Visual ‘Literacy’: Image, Mind, and Reality. Westview Press, Boulder, CO
Metz C 1974 Language and Cinema. Mouton, The Hague, The Netherlands
Nöth W 1990 Handbook of Semiotics. Indiana University Press, Bloomington, IN
Peirce C S 1982 Writings of Charles S. Peirce: A Chronological Edition. Indiana University Press, Bloomington, IN
Peirce C S 1992–98 The Essential Peirce. Indiana University Press, Bloomington, IN, Vols 1–2
Posner R, Robering K, Sebeok T A (eds.) 1997–98 Semiotik: Ein Handbuch zu den zeichentheoretischen Grundlagen von Natur und Kultur/Semiotics: A Handbook on the Sign-Theoretic Foundations of Nature and Culture. Walter de Gruyter, Berlin, Vols. 1–2
de Saussure F 1959 Course in General Linguistics. Peter Owen, London
Sebeok T A 1994 Encyclopedic Dictionary of Semiotics, 2nd edn. Mouton de Gruyter, Berlin, Vols 1–3
K. B. Jensen
Semiparametric Models Much empirical research in the social sciences is concerned with estimating conditional mean functions. For example, labor economists are interested in estimating the mean wages of employed individuals, conditional on characteristics such as years of work
experience and education. The most frequently used estimation methods assume that the conditional mean function is known up to a set of constant parameters that can be estimated from data, possibly by ordinary least squares. Models in which the only unknown quantities are a finite set of constant parameters are called ‘parametric.’ The use of a parametric model greatly simplifies estimation, statistical inference, and interpretation of the estimation results but is rarely justified by theoretical or other a priori considerations. Estimation and inference based on convenient but incorrect assumptions about the form of the conditional mean function can be highly misleading. Semiparametric statistical methods reduce the strength of the assumptions required for estimation and inference, thereby reducing the opportunities for obtaining misleading results. These methods are applicable to a wide variety of estimation problems in economics and other fields.
1. Introduction
A conditional mean function gives the mean of a dependent variable Y conditional on a vector of explanatory variables X. Denote the mean of Y conditional on X = x by E(Y | x). For example, suppose that Y is a worker’s weekly wage (or, more often in applied econometrics, the logarithm of the wage) and X includes such variables as years of work experience and education, race, and sex. Then E(Y | x) is the mean wage (or logarithm of the wage) when experience and the other explanatory variables have the values specified by x.
As an illustration, the solid line in Fig. 1 shows an estimate of the mean of the logarithm of weekly wages, log W, conditional on years of work experience, EXP, for white males with 12 years of education who work full time and live in urban areas of the North Central USA. The estimate was obtained by applying a nonparametric method (explained in Sect. 2) to data from the 1993 Current Population Survey (CPS). The estimated conditional mean of log W increases steadily up to approximately 30 years of experience and is flat thereafter.
Figure 1 Estimates of E(log W | EXP)
In most applications, E(Y | x) is unknown and must be estimated from data on the variables of interest. In the case of estimating a wage function, the data consist of observations of individuals’ wages, years of experience, and other characteristics. The most widely used method for estimating E(Y | x) is not the nonparametric method mentioned previously, but rather a method that assumes that E(Y | x) is known up to finitely many constant parameters. This gives a ‘parametric model’ for E(Y | x). Often, E(Y | x) is assumed to be a linear function of x, in which case the parameters can be estimated by ordinary least squares (OLS), among other ways. A linear function has the form E(Y | x) = β′x, where β is a vector of coefficients. For
example, if x consists of an intercept and the two variables x_1 and x_2, then β has three components, and β′x = β_0 + β_1 x_1 + β_2 x_2. OLS estimators are described in many textbooks. See, for example, Goldberger (1998).
The OLS estimator of E(Y | x) can be highly misleading if E(Y | x) is not linear in the components of x, that is if there is no β such that E(Y | x) = β′x. This problem is illustrated by the dashed and dotted lines in Fig. 1, which show two parametric estimates of the mean of the logarithm of weekly wages conditional on years of work experience. The dashed line is the OLS estimate that is obtained by assuming that E(log W | EXP) is the linear function E(log W | EXP) = β_0 + β_1 EXP. The dotted line is the OLS estimate that is obtained by assuming that E(log W | EXP) is quadratic: E(log W | EXP) = β_0 + β_1 EXP + β_2 EXP^2. The nonparametric estimate (solid line) places no restrictions on the shape of E(log W | EXP).
The linear and quadratic models give misleading estimates of E(log W | EXP). The linear model indicates that E(log W | EXP) steadily increases as experience increases. The quadratic model indicates that E(log W | EXP) decreases after 32 years of experience. In contrast, the nonparametric estimate of E(log W | EXP) becomes nearly flat at approximately 30 years of experience. Because the nonparametric estimate does not restrict the conditional mean function to be linear or quadratic, it is more likely to represent the true conditional mean function.
The opportunities for specification error increase if Y is binary. For example, consider a model of the choice of travel mode for the trip to work. Suppose that the available modes are automobile and transit. Let Y = 1 if an individual chooses automobile and Y = 0 if the individual chooses transit. Let X be a vector of
explanatory variables such as the travel times and costs by automobile and transit. Then E(Y | x) is the probability that Y = 1 (the probability that the individual chooses automobile) conditional on X = x. This probability will be denoted P(Y = 1 | x). In applications of binary response models, it is often assumed that P(Y = 1 | x) = G(β′x), where β is a vector of constant coefficients and G is a known probability distribution function. Often, G is assumed to be the cumulative standard normal distribution function, which yields a ‘binary probit’ model, or the cumulative logistic distribution function, which yields a ‘binary logit’ model (see Multivariate Analysis: Discrete Variables (Logistic Regression)). The coefficients β can then be estimated by the method of maximum likelihood (Amemiya 1985). However, there are now two potential sources of specification error. First, the dependence of Y on x may not be through the linear index β′x. Second, even if the index β′x is correct, the ‘response function’ G may not be the normal or logistic distribution function. See Horowitz (1993, 1998) for examples of specification errors in binary response models and their consequences.
Many investigators attempt to minimize the risk of specification error by carrying out a ‘specification search’ in which several different models are estimated and conclusions are based on the one that appears to fit the data best. Specification searches may be unavoidable in some applications, but they have many undesirable properties and their use should be minimized. There is no guarantee that a specification search will include the correct model or a good approximation to it. If the search includes the correct model, there is no guarantee that it will be selected by the investigator’s model selection criteria. Moreover, the
search process invalidates the statistical theory on which inference is based.
The rest of this entry describes methods that deal with the problem of specification error by relaxing the assumptions about functional form that are made by parametric models. The possibility of specification error can be essentially eliminated through the use of nonparametric estimation methods. These are described in Sect. 2. They assume that E(Y | x) is a smooth function but make no other assumptions about its shape or functional form. However, nonparametric methods have important disadvantages that seriously limit their usefulness in applications. Semiparametric methods, which are described in Sect. 3, offer a compromise. They make assumptions about functional form that are stronger than those of a nonparametric model but less restrictive than the assumptions of a parametric model, thereby reducing (though not eliminating) the possibility of specification error. In addition, semiparametric methods avoid the most serious practical disadvantages of nonparametric methods.
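The specification problem just described is easy to reproduce on simulated data. The following sketch stands in for the CPS example with an invented data-generating process: it fits linear and quadratic conditional mean functions by OLS to data whose true conditional mean flattens out, and both fitted models then mislead in the ways described above.

```python
import numpy as np

# Simulated stand-in for the wage example: the true E(Y | x) rises and
# then flattens, a shape neither a linear nor a quadratic model captures.
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0.0, 40.0, n)              # plays the role of experience
true_mean = np.minimum(0.05 * x, 1.5)      # flat after x = 30
y = true_mean + rng.normal(0.0, 0.3, n)

# OLS for the linear model E(Y | x) = b0 + b1*x
X_lin = np.column_stack([np.ones(n), x])
b_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)

# OLS for the quadratic model E(Y | x) = b0 + b1*x + b2*x^2
X_quad = np.column_stack([np.ones(n), x, x**2])
b_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)

# Near the top of the range the true mean is flat at 1.5, but the linear
# fit keeps rising with x and the quadratic fit eventually bends down.
print(b_lin @ np.array([1.0, 38.0]))
print(b_quad @ np.array([1.0, 38.0, 38.0**2]))
```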
2. Nonparametric Models
In nonparametric estimation, E(Y | x) is assumed to satisfy smoothness conditions such as differentiability, but no assumptions are made about its shape or the form of its dependence on x. Härdle (1990) and Fan and Gijbels (1996) provide detailed discussions of nonparametric estimation methods. One easily understood and frequently used method is called ‘kernel estimation.’ This method was used to produce the solid line in Fig. 1.
To describe the kernel method simply, assume that X is a continuously distributed, scalar random variable. Let {Y_i, X_i : i = 1, …, n} be a random sample of n observations of (Y, X). Let K be a probability density function that is bounded, continuous, and symmetrical about zero. For example, K may be the standard normal density function. Let {h_n} be a sequence of positive numbers that converges to 0 as n → ∞. For each n = 1, 2, … and i = 1, …, n define the function w_{ni}(·) by

$$w_{ni}(x) = \frac{K[(x - X_i)/h_n]}{\sum_{j=1}^{n} K[(x - X_j)/h_n]} \qquad (1)$$

Then the kernel nonparametric estimator of E(Y | x) is

$$H_n(x) = \sum_{i=1}^{n} w_{ni}(x) Y_i \qquad (2)$$

H_n(x) is a weighted average of the observed values of Y. Observations Y_i for which X_i is close to x get higher weight than do observations for which X_i is far from x. It can be shown that if h_n → 0 and nh_n/(log n) → ∞ as n → ∞, then H_n(x) → E(Y | x) with probability 1. Thus, if n is large, H_n(x) is likely to be very close to E(Y | x). Härdle (1990) provides a detailed discussion of the statistical properties of kernel nonparametric estimators.
Nonparametric estimation minimizes the risk of specification error, but the price of this flexibility can be high. One important reason for this is that the precision of a nonparametric estimator decreases rapidly as the number of continuously distributed components of X increases. This phenomenon is called the ‘curse of dimensionality.’ As a result of it, impracticably large samples are usually needed to obtain acceptable estimation precision if X is multidimensional, as it often is in social science applications. For example, a labor economist may want to estimate mean log wages conditional on years of work experience, years of education, and one or more indicators of skill levels, thus making the dimension of X at least 3. See Exploratory Data Analysis: Multivariate Approaches (Nonparametric Regression) for further discussion of the curse of dimensionality.
Another problem is that nonparametric estimates can be difficult to display, communicate, and interpret when X is multidimensional. Nonparametric estimates do not have simple analytic forms, so displaying and interpreting them can be difficult. If X is one- or two-dimensional, then the estimate of E(Y | x) can be displayed graphically as in Fig. 1, but only reduced-dimension projections can be displayed when X has three or more components. Many such displays and much skill in interpreting them can be needed to fully convey and comprehend the shape of the estimate of E(Y | x).
Another problem with nonparametric estimation is that it does not permit extrapolation. That is, it does not provide predictions of E(Y | x) at points x that are outside of the support (or range) of the random variable X. This is a serious drawback in policy analysis and forecasting, where it is often important to predict what might happen under conditions that do not exist in the available data. Finally, in nonparametric estimation, it can be difficult to impose restrictions suggested by economic or other theory. Matzkin (1994) discusses this issue.
Semiparametric methods permit greater estimation precision than do nonparametric methods when X is multidimensional. In addition, semiparametric estimates are easier to display and interpret than nonparametric ones and provide limited capabilities for extrapolation and imposing restrictions derived from economic or other theory models.
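A minimal implementation of Eqns. (1) and (2) follows. The standard normal kernel and the fixed bandwidth are illustrative choices only; in practice h_n would be chosen by a data-driven method, which is not attempted here.

```python
import numpy as np

def kernel_estimate(x, X, Y, h):
    """Kernel estimator H_n(x) of E(Y | x), Eqns. (1)-(2): a weighted
    average of the Y_i with weights proportional to K[(x - X_i)/h]."""
    u = (x - X) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # standard normal kernel K
    return np.sum(k * Y) / np.sum(k)              # sum_i w_ni(x) Y_i

# Illustrative data with E(Y | x) = sin(x).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 3.0, 500)
Y = np.sin(X) + rng.normal(0.0, 0.2, 500)

# The theory requires h -> 0 with nh/log(n) -> infinity as n grows; the
# fixed value below is a plausible choice for this sample, not an optimum.
print(kernel_estimate(1.5, X, Y, h=0.2))  # close to sin(1.5), about 0.997
```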
3. Semiparametric Models
The term ‘semiparametric’ refers to models in which there is an unknown function in addition to an unknown finite dimensional parameter. For example, the binary response model P(Y = 1 | x) = G(β′x) is
semiparametric if the function G and the vector of coefficients β are both treated as unknown quantities. This section describes two semiparametric models of conditional mean functions that are important in applications. The section also describes a related class of models that have no unknown finite-dimensional parameters but, like semiparametric models, mitigate the disadvantages of fully nonparametric models.
In addition to the estimation of conditional mean functions, semiparametric methods can be used to estimate conditional quantile and hazard functions, binary response models in which there is heteroskedasticity of unknown form, transformation models, and censored and truncated mean- and median-regression models, among others. Space does not permit discussion of these models here. Horowitz (1998) and Powell (1994) provide more comprehensive treatments in which these models are discussed.
3.1 Single Index Models
In a semiparametric single index model, the conditional mean function has the form

$$E(Y \mid x) = G(\beta' x) \qquad (3)$$
where β is an unknown constant vector and G is an unknown function. The quantity β′x is called an ‘index.’ The inferential problem is to estimate G and β from observations of (Y, X). G in Eqn. (3) is analogous to a link function in a generalized linear model, except that in Eqn. (3) G is unknown and must be estimated.
Model (3) contains many widely used parametric models as special cases. For example, if G is the identity function, then Eqn. (3) is a linear model. If G is the cumulative normal or logistic distribution function, then Eqn. (3) is a binary probit or logit model. When G is unknown, Eqn. (3) provides a specification that is more flexible than a parametric model but retains many of the desirable features of parametric models, as will now be explained.
One important property of single index models is that they avoid the curse of dimensionality. This is because the index β′x aggregates the dimensions of x, thereby achieving ‘dimension reduction.’ Consequently, the difference between the estimator of G and the true function can be made to converge to zero at the same rate that would be achieved if β′x were observable. Moreover, β can be estimated with the same rate of convergence that is achieved in a parametric model. Thus, in terms of the rates of convergence of estimators, a single index model is as accurate as a parametric model for estimating β and as accurate as a one-dimensional nonparametric model for estimating G. This dimension reduction feature of single index models gives them a considerable advantage over nonparametric methods in applications where X is multidimensional and the single index structure is plausible.
A single index model permits limited extrapolation. Specifically, it yields predictions of E(Y | x) at values of x that are not in the support of X but are in the support of β′X. Of course, there is a price that must be paid for the ability to extrapolate. A single index model makes assumptions that are stronger than those of a nonparametric model. These assumptions are testable on the support of X but not outside of it. Thus, extrapolation (unavoidably) relies on untestable assumptions about the behavior of E(Y | x) beyond the support of X.
Before β and G can be estimated, restrictions must be imposed that insure their identification. That is, β and G must be uniquely determined by the population distribution of (Y, X). Identification of single index models has been investigated by Ichimura (1993) and, for the special case of binary response models, Manski (1988). It is clear that β is not identified if G is a constant function or there is an exact linear relation among the components of X (perfect multicollinearity). In addition, Eqn. (3) is observationally equivalent to the model E(Y | x) = G*(γ + δβ′x), where γ and δ ≠ 0 are arbitrary and G* is defined by the relation G*(γ + δν) = G(ν) for all ν in the support of β′X. Therefore, β and G are not identified unless restrictions are imposed that uniquely specify γ and δ. The restriction on γ is called ‘location normalization’ and can be imposed by requiring X to contain no constant (intercept) component. The restriction on δ is called ‘scale normalization.’ Scale normalization can be achieved by setting the β coefficient of one component of X equal to one. A further identification requirement is that X must include at least one continuously distributed component whose β coefficient is nonzero. Horowitz (1998) gives an example that illustrates the need for this requirement. Other more technical identification requirements are discussed by Ichimura (1993) and Manski (1988).
The main estimation challenge in single index models is estimating β. Given an estimator b_n of β, G can be estimated by carrying out the nonparametric regression of Y on b′_n X (e.g., by using the kernel method described in Sect. 2). Several estimators of β are available. Ichimura (1993) describes a nonlinear least squares estimator. Klein and Spady (1993) describe a semiparametric maximum likelihood estimator for the case in which Y is binary. These estimators are difficult to compute because they require solving complicated nonlinear optimization problems. Powell et al. (1989) describe a ‘density-weighted average derivative estimator’ (DWADE) that is noniterative and easily computed. The DWADE applies when all components of X are continuous random variables. It is based on the relation

$$\beta \propto E[\,p(X)\,\partial G(\beta' X)/\partial X\,] = -2\,E[\,Y\,\partial p(X)/\partial X\,] \qquad (4)$$

where p is the probability density function of X and the second equality follows from integrating the first by
parts. Thus, β can be estimated up to scale by estimating the expression on the right-hand side of the second equality. Powell et al. (1989) show that this can be done by replacing p with a nonparametric estimator and replacing the population expectation E with a sample average. Horowitz and Härdle (1996) extend this method to models in which some components of X are discrete. They also give an empirical example that illustrates the usefulness of single index models. Ichimura and Lee (1991) investigate a multiple index generalization of Eqn. (3).
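To make the idea concrete, the following minimal sketch simulates data from a single index model and recovers β up to scale from the sample analog of Eqn. (4). It is a sketch in the spirit of Powell et al. (1989), not the authors' implementation; the data-generating process, bandwidth rule, and sample size are illustrative assumptions.

```python
# Minimal sketch of a density-weighted average derivative estimator.
# Design, bandwidth, and G are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 3
beta = np.array([1.0, 2.0, -1.5])                  # true index coefficients
X = rng.normal(size=(n, d))                        # continuous covariates (required for the DWADE)
Y = np.tanh(X @ beta) + 0.1 * rng.normal(size=n)   # single index model with G = tanh

h = n ** (-1.0 / (d + 4))                          # rule-of-thumb bandwidth (an assumption)

def grad_density(x0):
    """Gradient of a Gaussian-kernel density estimate of p at the point x0."""
    u = (x0 - X) / h
    k = np.exp(-0.5 * (u ** 2).sum(axis=1))        # unnormalized Gaussian kernel
    c = n * h ** d * (2.0 * np.pi) ** (d / 2.0)    # normalizing constant
    return (-(u / h) * k[:, None]).sum(axis=0) / c

# Sample analog of the right-hand side of Eqn. (4): -2 E[Y dp(X)/dX]
b = -2.0 * np.mean(Y[:, None] * np.array([grad_density(x) for x in X]), axis=0)
print("estimated direction:", b / b[0])            # scale normalization: first coefficient = 1
print("true direction:     ", beta / beta[0])
```

Because the estimator identifies β only up to scale, the printed vectors are normalized so that the first coefficient equals one, matching the scale normalization described above.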
3.2 Partially Linear Models

In a partially linear model, X is partitioned into two nonoverlapping subvectors, X_1 and X_2. The model has the form

E(Y | x_1, x_2) = β′x_1 + G(x_2)   (5)

where β is an unknown constant vector and G is an unknown function. This model is distinct from the class of single index models. A single index model is not partially linear unless G is a linear function. Conversely, a partially linear model is a single index model only in this case. Stock (1989, 1991) and Engle et al. (1986) illustrate the use of Eqn. (5) in applications. Identification of β requires the 'exclusion restriction' that none of the components of X_1 are perfectly predictable by components of X_2. When β is identified, it can be estimated with an n^{-1/2} rate of convergence regardless of the dimensions of X_1 and X_2. Thus, the curse of dimensionality is avoided in estimating β. An estimator of β can be obtained by observing that Eqn. (5) implies

Y − E(Y | x_2) = β′[X_1 − E(X_1 | x_2)] + U   (6)

where U is an unobserved random variable satisfying E(U | x_1, x_2) = 0. Robinson (1988) shows that under regularity conditions β can be estimated by applying OLS to Eqn. (6) after replacing E(Y | x_2) and E(X_1 | x_2) with nonparametric estimators. The estimator of β, b_n, converges at rate n^{-1/2} and is asymptotically normally distributed. G can be estimated by carrying out the nonparametric regression of Y − b_n′X_1 on X_2. Unlike b_n, the estimator of G suffers from the curse of dimensionality; its rate of convergence decreases as the dimension of X_2 increases.
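The two-step logic behind Robinson's estimator can be sketched in a few lines: nonparametrically partial X_2 out of both Y and X_1, then run OLS on the residuals. The kernel regression, bandwidth, and simulated design below are illustrative assumptions.

```python
# Sketch of the two-step estimator for the partially linear model
# E(Y | x1, x2) = beta'x1 + G(x2), following the logic of Robinson (1988).
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x2 = rng.uniform(-2.0, 2.0, size=n)
x1 = np.column_stack([x2 + rng.normal(size=n), rng.normal(size=n)])
beta = np.array([1.0, -0.5])
y = x1 @ beta + np.sin(np.pi * x2) + 0.2 * rng.normal(size=n)  # G(v) = sin(pi v)

def kreg(target, h=0.2):
    """Nadaraya-Watson regression of target on x2, evaluated at the sample points."""
    w = np.exp(-0.5 * ((x2[:, None] - x2[None, :]) / h) ** 2)
    denom = w.sum(axis=1)
    return (w @ target) / (denom if target.ndim == 1 else denom[:, None])

# Step 1: partial x2 out of y and x1 by nonparametric regression, as Eqn. (6) suggests
ry = y - kreg(y)
rx = x1 - kreg(x1)

# Step 2: OLS of the y-residuals on the x1-residuals gives b_n
b_n = np.linalg.lstsq(rx, ry, rcond=None)[0]
print("b_n =", b_n, "(true beta =", beta, ")")

# G is then estimated by regressing y - b_n'x1 nonparametrically on x2
G_hat = kreg(y - x1 @ b_n)
```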
3.3 Nonparametric Additive Models

Let X have d continuously distributed components that are denoted X_1, …, X_d. In a nonparametric additive model of the conditional mean function,

E(Y | x) = µ + f_1(x_1) + ⋯ + f_d(x_d)   (7)

where µ is a constant and f_1, …, f_d are unknown functions that satisfy a location normalization condition such as

∫ f_k(ν)w_k(ν) dν = 0,   k = 1, …, d   (8)

where w_k is a non-negative weight function. An additive model is distinct from a single index model unless E(Y | x) is a linear function of x. Additive and partially linear models are distinct unless E(Y | x) is partially linear and G in Eqn. (5) is additive. An estimator of f_k (k = 1, …, d) can be obtained by observing that Eqns. (7) and (8) imply

f_k(x_k) = ∫ E(Y | x)w_{-k}(x_{-k}) dx_{-k}   (9)

where x_{-k} is the vector consisting of all components of x except the kth and w_{-k} is a weight function that satisfies ∫ w_{-k}(x_{-k}) dx_{-k} = 1. The estimator of f_k is obtained by replacing E(Y | x) on the right-hand side of Eqn. (9) with nonparametric estimators. Linton and Nielsen (1995) and Linton (1997) present the details of the procedure and extensions of it. Under suitable conditions, the estimator of f_k converges to the true f_k at rate n^{-2/5} regardless of the dimension of X. Thus, the additive model provides dimension reduction. It also permits extrapolation of E(Y | x) within the rectangle formed by the supports of the individual components of X. Hastie and Tibshirani (1990) and Exploratory Data Analysis: Multivariate Approaches (Nonparametric Regression) discuss an alternative estimation procedure called 'backfitting.' This procedure is widely used, but its asymptotic properties are not yet well understood. Linton and Härdle (1996) describe a generalized additive model whose form is

E(Y | x) = G[µ + f_1(x_1) + ⋯ + f_d(x_d)]   (10)

where f_1, …, f_d are unknown functions and G is a known, strictly increasing (or decreasing) function. Horowitz (2001) describes a version of Eqn. (10) in which G is unknown. Both forms of Eqn. (10) achieve dimension reduction. When G is unknown, Eqn. (10) nests additive and single index models and, under certain conditions, partially linear models. The use of the nonparametric additive specification (7) can be illustrated by estimating the model

E(log W | EXP, EDUC) = µ + f_EXP(EXP) + f_EDUC(EDUC)

where W and EXP are defined as in Sect. 1, and EDUC denotes years of education. The data are taken from the 1993 CPS and are for white males with 14 or fewer years of education who work full time and live in urban areas of the North Central US. The results are shown in Fig. 2. The unknown functions f_EXP and f_EDUC are estimated by the method of Linton and Nielsen (1995) and are normalized so that f_EXP(2) = f_EDUC(5) = 0.

[Figure 2. (a) Estimate of f_EXP in the additive nonparametric model of E(log W | EXP, EDUC). (b) Estimate of f_EDUC in the additive nonparametric model of E(log W | EXP, EDUC).]

The estimates of f_EXP (Fig. 2a) and f_EDUC (Fig. 2b) are nonlinear and differently shaped. Functions f_EXP and f_EDUC with different shapes cannot be produced by a single index model, and a lengthy specification search might be needed to find a parametric model that produces the shapes shown in Fig. 2. Some of the fluctuations of the estimates of f_EXP and f_EDUC may be artifacts of random sampling error rather
than features of E(log W | EXP, EDUC). However, a more elaborate analysis that takes account of the effects of random sampling error rejects the hypothesis that either function is linear.
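The marginal integration recipe of Eqns. (8) and (9) can be sketched in a few lines of code. The following toy example, with an invented two-component design, uniform weight functions, and arbitrary bandwidths, recovers f_1 by averaging a full-dimensional kernel regression over the other component. It illustrates the idea only; it is not the estimator used for Fig. 2.

```python
# Toy marginal integration estimator for an additive model, in the spirit of
# Linton and Nielsen (1995). Design, weights, and bandwidth are assumptions.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
X = rng.uniform(-1.0, 1.0, size=(n, 2))
f1 = lambda v: v ** 2 - 1.0 / 3.0                  # centered additive components
f2 = lambda v: np.sin(np.pi * v)
y = 1.0 + f1(X[:, 0]) + f2(X[:, 1]) + 0.2 * rng.normal(size=n)

def m_hat(x, h=0.15):
    """Full-dimensional Nadaraya-Watson estimate of E(Y | x)."""
    w = np.exp(-0.5 * (((x - X) / h) ** 2).sum(axis=1))
    return (w @ y) / w.sum()

u_grid = np.linspace(-1.0, 1.0, 25)                # integration grid for x2

def f1_hat(v):
    # Eqn. (9) with a uniform weight: integrate m_hat(v, .) over x2
    return np.mean([m_hat(np.array([v, u])) for u in u_grid])

v_grid = np.linspace(-0.9, 0.9, 7)
est = np.array([f1_hat(v) for v in v_grid])
est -= est.mean()                                  # location normalization, as in Eqn. (8)
print(np.round(est, 2))
print(np.round(f1(v_grid) - f1(v_grid).mean(), 2)) # compare with the truth
```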
4. Conclusions This article has described several semiparametric methods for estimating conditional mean functions. These methods relax the restrictive assumptions made by linear and other parametric models, thereby reducing (though not eliminating) the likelihood of
seriously misleading inference. The value of semiparametric methods in empirical research has been demonstrated. Their use is likely to increase as their availability in commercial statistical software packages increases.
Bibliography
Amemiya T 1985 Advanced Econometrics. Harvard University Press, Cambridge, MA
Engle R F, Granger C W J, Rice J, Weiss A 1986 Semiparametric estimates of the relationship between weather and electricity sales. Journal of the American Statistical Association 81: 310–20
Fan J, Gijbels I 1996 Local Polynomial Modelling and its Applications. Chapman & Hall, London
Goldberger A S 1998 Introductory Econometrics. Harvard University Press, Cambridge, MA
Härdle W 1990 Applied Nonparametric Regression. Cambridge University Press, Cambridge, UK
Hastie T J, Tibshirani R J 1990 Generalized Additive Models. Chapman & Hall, London
Horowitz J L 1993 Semiparametric and nonparametric estimation of quantal response models. In: Maddala G S, Rao C R, Vinod H D (eds.) Handbook of Statistics. Elsevier, Amsterdam, Vol. 11, pp. 45–72
Horowitz J L 1998 Semiparametric Methods in Econometrics. Springer-Verlag, New York
Horowitz J L 2001 Nonparametric estimation of a generalized additive model with an unknown link function. Econometrica 69: 599–631
Horowitz J L, Härdle W 1996 Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association 91: 1632–40
Ichimura H 1993 Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics 58: 71–120
Ichimura H, Lee L-F 1991 Semiparametric least squares estimation of multiple index models: single equation estimation. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK, pp. 3–49
Klein R W, Spady R H 1993 An efficient semiparametric estimator for binary response models. Econometrica 61: 387–421
Linton O B 1997 Efficient estimation of additive nonparametric regression models. Biometrika 84: 469–73
Linton O B, Härdle W 1996 Estimating additive regression models with known links. Biometrika 83: 529–40
Linton O B, Nielsen J P 1995 A kernel method of estimating structured nonparametric regression based on marginal integration. Biometrika 82: 93–100
Manski C F 1988 Identification of binary response models. Journal of the American Statistical Association 83: 729–38
Matzkin R L 1994 Restrictions of economic theory in nonparametric methods. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 4, pp. 2523–58
Powell J L 1994 Estimation of semiparametric models. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 4, pp. 2444–521
Powell J L, Stock J H, Stoker T M 1989 Semiparametric estimation of index coefficients. Econometrica 57: 1403–30
Robinson P M 1988 Root-N-consistent semiparametric regression. Econometrica 56: 931–54
Stock J H 1989 Nonparametric policy analysis. Journal of the American Statistical Association 84: 567–75
Stock J H 1991 Nonparametric policy analysis: An application to estimating hazardous waste cleanup benefits. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK, pp. 77–98
J. L. Horowitz
Senescence: Genetic Theories Senescence is the progressive deterioration of vitality that accompanies increasing age. Like other features of organismal life histories, patterns of senescence vary between individuals within populations, between populations of the same species, and between species, suggesting that they are modifiable by genetic factors and subject to evolutionary change. In this article, the various evolutionary forces that might direct genetic modifications of senescence are considered, and a theoretical framework for understanding the evolution of life histories is presented. The secondary problem of the maintenance of genetic variation for life history traits is also reviewed.
1. Medawar’s Principle The modern evolutionary theory of senescence begins with Medawar who argued that ‘… the efficacy of natural selection deteriorates with increasing age’ (1952, p. 23). A simple hypothetical example similar to a case considered by Hamilton (1966) illustrates the principle. Consider two genetic variants in humans, both having age-specific effects as follows: each variant confers complete immunity against a lethal disease, but only for one particular year of life. The first variant gives immunity to 12 year-olds, while the second variant confers immunity at the age of 60 years. What are the relative selective advantages of the genetic variants? If, for simplicity, the effects of parental care are ignored and it is also assumed that menopause always comes before 60 years, then it is immediately obvious that the second variant is selectively neutral, having no effect at all on the ability of carriers to transmit genes to the next generation, whereas the first variant has a significant selective advantage. This example illustrates the general principle that natural selection is most effective in the
young. To obtain a more exact and quantitative understanding of the relation between organismal age and the force of selection, it is necessary to develop a description of selection in age-structured populations.
2. Age-structured Populations

Some organisms, such as annual plants, complete their life cycles in discrete fashion, exhibiting no overlap of parental and offspring generations. However, most higher organisms, including humans, have overlapping generations. Under the latter circumstances, the description of population composition and growth requires two kinds of information: age-specific survival, and age-specific fertility. Survivorship, denoted l(x), is defined as the probability of survival from birth or hatching until age x. A survivorship curve is a graph of l(x) versus x, where x ranges from zero to the greatest age attained in the population. The survivorship is initially 100 percent at birth and then declines to zero at the maximum observed age; it cannot increase with increasing age. If a cohort of 1,000 age-synchronized individuals is followed throughout their lives, then 500 of them will be alive at the age when l(x) is 0.50, 100 will be alive when l(x) is 0.10, and so on. Age-specific fertility, represented as m(x), is defined as the average number of progeny produced by a female of age x. One of the fundamentals of demography is that, under a wide range of conditions, a population having fixed l(x) and m(x) schedules will eventually attain a stable age-structure. That is, after a period of time the proportions of the population in each age-class will reach unchanging values. If the survivorship or fertility schedules are altered, then a different age-structure will evolve. Prior to attaining the stable age distribution, population growth is likely to be erratic, but once the stable age distribution is reached, then, under the assumption of unlimited resources, the population will grow smoothly. In particular, the population will exhibit exponential growth described by the following equation:

N(t) = N(0)e^{rt}
(1)
where N(t) is population size as a function of time t, N(0) is initial population size, e is the natural exponential, and r is the Malthusian parameter, also known as the intrinsic rate of increase of the population. The parameter r combines the effects of age-specific survival and fertility and translates them into a population growth rate. The value of r is the implicit solution to the following equation, known as the Euler–Lotka equation:
∫ e^{-rx} l(x) m(x) dx = 1   (2)
The significance of the Malthusian parameter is that it reflects fitness, in the Darwinian sense, in an age-
structured population. If there are several genotypes in a population, and if those genotypes differ with respect to age-specific survival or fertility patterns, then each genotype will have a particular r value. Those r’s specify the rate at which a population consisting of only that genotype would grow, once the stable age distribution has been attained. The r’s also specify the relative fitnesses of the genotypes in a genotypically mixed population. The genotype with the highest r has the highest fitness and will be favored by natural selection under conditions that allow population growth. Much of the evolutionary theory relating to senescence and life histories uses the Malthusian parameter r as a surrogate for Darwinian fitness, essentially asking what changes in the l(x) and m(x) schedules would maximize the intrinsic rate of increase. There is one other quantity that arises as a measure of fitness in populations with overlapping generations. Fisher (1930) defined ‘reproductive value,’ which is the expected number of progeny that will be produced by an individual of age x over the rest of its lifetime, given that it has survived to age x. Reproductive value is not the same as fitness, because it does not take into account the chances of surviving to age x.
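In practice r must be found numerically. The short sketch below solves the Euler–Lotka equation for an invented life table; the survivorship and fertility schedules, and the use of a discrete sum in place of the integral in Eqn. (2), are illustrative assumptions.

```python
# Solving the Euler-Lotka equation (Eqn. (2), in discrete form) for the
# Malthusian parameter r of a hypothetical life table.
import numpy as np
from scipy.optimize import brentq

ages = np.arange(1, 11)                                  # age classes 1-10
lx = 0.95 ** ages                                        # survivorship l(x)
mx = np.where((ages >= 3) & (ages <= 8), 1.2, 0.0)       # fertility m(x)

def euler_lotka(r):
    """sum_x e^(-rx) l(x) m(x) - 1; the root in r makes the sum equal one."""
    return np.sum(np.exp(-r * ages) * lx * mx) - 1.0

r = brentq(euler_lotka, -1.0, 2.0)                       # bracketing root search
print(f"Malthusian parameter r = {r:.4f}, lambda = e^r = {np.exp(r):.4f}")
```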
3. Hamilton’s Perturbation Analysis

Hamilton (1966) asked the following question: What sorts of small genetic changes in the l(x) or m(x) schedules will be favored by natural selection? To answer this question he employed the Malthusian parameter r as a measure of fitness, assuming that the modifications of l(x) and m(x) that lead to the highest value of r will be the ones to evolve. He also approximated the continuous functions described above with their discrete-time counterparts. The discrete-time rate of population increase is:

λ = e^r
(3)
The discrete-time version of the Euler–Lotka equation is

Σ_x l_x m_x λ^{-x} = 1
(4)
Age-specific survival is expressed in discrete time as:

l_x = p_1 p_2 p_3 ⋯ p_x
(5)
where p_x is the probability of surviving the duration of the xth age class given that one has survived to the beginning of age class x. Now consider the evolutionary fate of a mutation which causes a small change in the ability to survive at some particular age a. The new mutation will be favored by natural selection if it causes an increase in r, or, what is equivalent in the discrete-time case, an increase in ln λ. The effect of the perturbation is studied by examining the partial derivative of λ with
respect to p_a. Hamilton obtained a closed form of this derivative and was able to conclude the following:
(a) The force of selection, as indicated by the partial derivative, is highest at the youngest pre-reproductive ages, begins to decline when reproduction commences, and drops to zero when reproduction ceases.
(b) If a mutation causes a gain in survival at a particular age a_1 and an equal loss in survival at age a_2, then such a mutation will increase in the population only if a_1 < a_2.
(c) If a mutation causes a gain in fertility at a particular age a_1 and an equal loss of fertility at age a_2, then such a mutation will increase in the population only if a_1 < a_2.
(d) If a mutation causes a loss in survival at a particular age and an increase in fertility at that same age, then the limits of the loss in survival that can be tolerated are set by the inverse of the reproductive value. That is, if the reproductive value at age x is large, then only a small reduction of survival can be exchanged for a gain in fertility, but if the remaining reproductive value is small, then a large reduction in survival can evolve. (For further explication of these results, see Roughgarden 1996, p. 363.)
Hamilton's general conclusion is that 'for organisms that reproduce repeatedly senescence is to be expected as an inevitable consequence of the working of natural selection' (1966, p. 26). This is a view that is clearly consistent with Medawar (1952). For a technical discussion of the validity of the assumption that the Malthusian parameter is equivalent to fitness in age-structured populations, see Charlesworth (1980), whose models extend the early results of Haldane (1927) and Norton (1928).
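The decline in the force of selection can be seen numerically. The sketch below perturbs age-specific survival p_a in a made-up life table and measures the resulting change in λ; all parameter values are invented for illustration.

```python
# Numerical version of Hamilton's perturbation analysis: how sensitive is
# lambda to a small change in age-specific survival p_a?
import numpy as np

amax = 15
p = np.full(amax, 0.9)                 # survival through each age class
m = np.zeros(amax)
m[3:10] = 1.5                          # reproduction at ages 4-10 only
x = np.arange(1, amax + 1)

def growth_rate(p):
    lx = np.cumprod(p)                 # l_x = p_1 p_2 ... p_x, as in Eqn. (5)
    lo, hi = 0.5, 3.0                  # bisect on sum_x l_x m_x lam^(-x) = 1 (Eqn. (4))
    for _ in range(60):
        lam = 0.5 * (lo + hi)
        if np.sum(lx * m * lam ** (-x)) > 1.0:
            lo = lam
        else:
            hi = lam
    return lam

base = growth_rate(p)
for a in (1, 5, 9, 12):                # perturb survival at a single age a
    dp = p.copy()
    dp[a - 1] += 1e-4
    print(f"age {a:2d}: dlambda/dp_a ~ {(growth_rate(dp) - base) / 1e-4:.4f}")
# Sensitivities are largest at pre-reproductive ages, decline during the
# reproductive span, and are zero after reproduction ceases (conclusion (a)).
```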
4. Pleiotropy

Pleiotropy means that a single gene affects two or more characters. In the context of life history evolution, pleiotropy means that a single gene affects the fitness of the organism at two or more ages. It is convenient to categorize the combinations of age-specific pleiotropic effects as shown in Table 1.

Table 1 Pleiotropic effects of mutations affecting life history

Effect on fitness in the young   Effect on fitness in the old   Evolutionary fate
+                                +                              Increase
−                                −                              Decrease
−                                −                              Decrease
−                                +                              Decrease
+                                −                              Increase

If a new mutation improves fitness in both young and old animals, then it is likely to be favored by natural selection, and will increase in the population. Conversely, a gene that decreases fitness in both young and old organisms will be eliminated by natural selection. The more interesting cases in Table 1 are those in which the fitness effects on young and old organisms are negatively correlated, a condition referred to as 'negative pleiotropy' or 'antagonistic pleiotropy.' Medawar's principle suggests that mutations that improve early fitness at the expense of late fitness will be favored by natural selection, while those with the converse effects will be eliminated. The possibility that genes might increase fitness at one age and also decrease it at another was mentioned by early theorists, but the first strong advocate of this mechanism of the evolution of senescence was Williams (1957), who noted that natural selection will tend to maximize vigor in the young at the expense of vigor later in life. An example of negatively pleiotropic gene action of the sort that Williams proposed is shown in Table 2.

Table 2 Fitness parameters for a one-locus model of antagonistic pleiotropy

Genotype:           A1A1     A1A2       A2A2
Fitness in young:   High     Medium     Low
Fitness in old:     Low      Medium     High

Williams argued that, in the course of selecting for the allele A1, which is beneficial at young ages, the deleterious effects of allele A2 on the old are brought along; in this scenario, senescence evolves as an incidental consequence of adaptation at earlier ages. The exact mathematical conditions for the increase of antagonistic, pleiotropic mutations have been derived (Charlesworth 1980), verifying that such mutations can indeed increase in populations. While the theoretical basis for antagonistic pleiotropy is sound and widely accepted, it is unclear whether there exists the special sort of genetic variation that this mechanism requires. While it is easy to imagine physiological situations in which there could occur trade-offs between the fitness of the young and the old, there are few, if any, actual cases of such variation identified (Finch 1990, p. 37), even though a half century has passed since Williams' proposal. Negative correlations between life history characters are sometimes construed as evidence for pleiotropy, but this interpretation overlooks the fact that phenotypic correlations arise from factors other than pleiotropy, including the correlation of environmental factors and the correlation of alleles at genetically linked loci (linkage disequilibrium). What is required for the antagonistic pleiotropy model is not just evidence of trade-offs in life history traits, which is abundant, but a demonstration that there exist trade-offs in life history characters that are mediated by alternative alleles at specific polymorphic loci. Until such genes are characterized and shown to play a role
in life history evolution, the antagonistic pleiotropy model will remain an interesting theoretical construct, but one of unknown, and possibly negligible, biological significance.
5. Mutation Accumulation

Germ-line mutations, which are changes in the DNA sequence in sperm and egg cells, occur at low but nonzero rates, largely as a result of proof-reading errors in the enzymes that replicate DNA. This slow, steady input of genetic variants has the potential to corrupt the gene pool, since almost all of the novel variants that have some effect are deleterious. However, natural selection works against the corrupting effect by removing carriers of deleterious mutations. A balance is reached between the steady input of deleterious genes through mutation and their removal by natural selection. One of the characteristics of the equilibrium balance state is that, for any particular gene, the deleterious alleles are present at low frequencies, usually much less than 1 percent. The low rate of occurrence of each of many hereditary human diseases is thought to reflect the mutation-selection balance operating at many genes, each of which is capable of being mutated to produce a deleterious condition (Hartl and Clark 1997). The classical mutation-selection balance model is appropriate for mutations that have deleterious effects early in life, but what happens when the disability is expressed only late in life? Medawar (1952) suggested that natural selection will be unable to counteract the feeble pressure of repetitive mutation if the mutant genes make their effects known at advanced ages, either post-reproductively or at ages not attained by most of the members of the species. This follows naturally from his proposal that the force of selection declines with increasing age. Under such conditions, the deleterious mutations would gradually accumulate, unchecked by natural selection. In this view, senescence is a process driven entirely by mutation. This mechanism for the evolution of senescence is distinct from, but not mutually exclusive of, antagonistic pleiotropy. While the pleiotropy process suggests that senescence is the incidental consequence of adaptation, the mutation accumulation model invokes deterioration without adaptation (Partridge and Barton 1993). Charlesworth (1980) has analyzed a deterministic model of an age-structured population with recurrent mutation. He derived an approximation for the frequency of heterozygous carriers of deleterious alleles and found that the equilibrium frequency is inversely proportional to the selection intensity. The significance of this result is that when there is only very weak selection pressure, as at advanced ages, then mutant alleles can attain high frequencies under the influence of recurrent mutation. This result verifies the earlier conjectures of Medawar and Williams. In contrast to the situation with antagonistic pleiotropy, there is experimental evidence for the kinds of genetic variation that the model requires, namely spontaneous mutations with age-specific effects on vital rates (Mueller 1987, Pletcher et al. 1998, 1999).
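The inverse-proportionality result can be illustrated with the textbook approximation that the equilibrium frequency of a deleterious allele maintained by recurrent mutation is roughly the mutation rate divided by the selection intensity. The mutation rate and the age-specific selection intensities below are invented numbers, and the mu/s rule is a standard approximation rather than Charlesworth's exact expression.

```python
# Toy illustration: equilibrium allele frequency ~ mu / s under
# mutation-selection balance, so weak selection at late ages lets
# deleterious alleles reach high frequencies. Values are invented.
mu = 1e-5                                  # per-generation mutation rate
schedule = [(10, 1e-1), (30, 1e-2), (50, 1e-3), (70, 1e-4)]
for age, s in schedule:                    # s: selection intensity at that age
    print(f"effect expressed at age {age}: s = {s:g}, "
          f"equilibrium frequency ~ {mu / s:.2%}")
```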
6. Postponement of Deleterious Effects Medawar (1952) also considered the case of Huntington’s chorea, a grave and ultimately fatal nervous disorder that usually manifests itself in middle-aged patients. He suggested that there could be selection in favor of genetic modifiers which have as their main effect the postponement of the effects of the Huntington’s gene or other genes causing hereditary disorders. This suggestion, which very much resembles an earlier proposal of R. A. Fisher concerning the evolution of dominance, is unlikely to be correct. While it makes sense that there would be some selection in favor of delaying the mutant effect, Charlesworth (1980, p. 219) has shown that the selection pressure exerted on such a hypothetical modifier gene would be exceedingly small, on the order of the mutation rate. This is because the modifier has an effect on fitness only when it co-occurs with the Huntington’s or other disease gene, which is at mutation-selection balance and present in only a small fraction of the population. Under such conditions the evolutionary fate of the modifier is likely to be determined by genetic drift or other stochastic factors rather than the minuscule selective pressure.
7. The Variation Problem While the primary concern of theorists has been to explain the degeneration of vitality associated with old age, there is a secondary problem that can also be addressed with these models. Genetic variation is the raw material upon which natural selection operates to produce adaptations and new species. The mechanisms by which variation is maintained in populations are therefore of considerable interest to evolutionary geneticists. To what extent do genetic models of senescence tend to maintain variation in life histories within populations? Several authors have addressed the question and come to two different answers depending upon the theoretical construct employed. Curtsinger et al. (1994) analyzed deterministic one- and two-locus models of antagonistic pleiotropy and asked under what conditions polymorphisms would be maintained. The conditions for stable polymorphism were found to be rather restrictive, especially with weak selection. The conditions were also found to be very sensitive to dominance parameters; in particular, reversal of dominance properties with respect to the two traits is often required for polymorphism, but seems improbable on biochemical grounds.
Tuljapurkar (1997) gives an overview of modeling strategies and describes some of his own models in which mortality is assumed to depend on both organismal age and random variables in the environment. In these models, the relative fitnesses are measured by a stochastic growth rate, which reflects average vital rates and environmental variability. Results from several related models show that phenotypic combinations that differ in age-specific fertility can be equally fit in a range of stochastic environments. The paper concludes that polymorphisms for length of reproductive life can be readily maintained by selection in temporally varying environments.
8. Future Directions

Two important challenges to the genetical theories of senescence arise from recent experimental work. The first challenge concerns mortality rates at advanced ages. Observations of survival in experimental organisms are usually presented in terms of age-specific survivorship, as defined in Sect. 2, but if sample sizes are sufficiently large then the survival data can also be analyzed in terms of hazard functions, which define the instantaneous risk of death as a function of age. Unlike survivorship, the hazard function can be nonmonotonic. Many experimental studies of moderate sample size have documented that the hazard increases approximately exponentially with age, a dynamic generally referred to as the Gompertz law (Finch 1990). Recent experiments have been done on an unusually large scale, making it possible to estimate hazards at very advanced ages. For Drosophila, nematode worms, and Medflies, hazard functions increase exponentially in the early part of the life history, as expected, but at the most advanced ages the hazard functions decelerate, bending over and producing unexpected 'mortality plateaus' (see Vaupel et al. 1998, for a review of the experimental evidence and data on human populations). The existence of mortality plateaus at advanced, post-reproductive ages poses a challenge for mutation accumulation models, which predict, under a wide range of assumptions, a 'wall' of high mortality at the age when reproduction is completed. A preliminary attempt to accommodate mortality plateaus into antagonistic pleiotropy models has failed (Pletcher and Curtsinger 1998, Wachter 1999). One possible solution is that the mortality plateaus are caused by population heterogeneity of both genetic and nongenetic origin (Pletcher and Curtsinger 2000, Service 2000). It has so far proved very difficult to measure the relevant heterogeneity and determine whether it is of sufficient magnitude to produce the plateaus. A second possibility that can explain some features of the plateaus is a model of positive pleiotropy, which causes late-life mortality rates to avoid inflation because of the positively correlated effects of alleles selected for early survival (Pletcher and Curtsinger 1998). Models of positive pleiotropy merit further investigation.
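The heterogeneity explanation can be made concrete with a standard frailty calculation: if each individual's hazard is Gompertz but a gamma-distributed frailty multiplies it, the population hazard flattens at advanced ages. This sketch uses the well-known gamma-Gompertz formula with invented parameter values; it illustrates the general idea, not any of the cited analyses.

```python
# Sketch of the heterogeneity explanation for mortality plateaus: individual
# hazards follow the Gompertz law h(x) = A*exp(G*x), but individuals carry a
# gamma-distributed frailty (mean 1, shape k). The population hazard then
# levels off at advanced ages. All parameter values are illustrative.
import numpy as np

A, G, k = 1e-4, 0.1, 4.0
x = np.linspace(0.0, 120.0, 500)
H = (A / G) * (np.exp(G * x) - 1.0)              # cumulative baseline hazard
pop_hazard = A * np.exp(G * x) / (1.0 + H / k)   # gamma-frailty population hazard
# pop_hazard rises exponentially at first, then flattens toward G*k:
# a 'mortality plateau' produced purely by population heterogeneity.
print(np.round(pop_hazard[::100], 4))
```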
The second experimental challenge to current theory concerns genetic variance. The mutation accumulation model predicts that genetic variance for mortality should increase at advanced ages. Recent experiments document instead a decline of genetic variance at advanced ages in experimental populations of Drosophila (Promislow et al. 1996). Hughes and Charlesworth (1994) initially reported that genetic variance for mortality increases with age in their Drosophila populations, but a re-analysis of the data shows close concordance with the Promislow result (Shaw et al. 1999). At present no one knows why genetic variance declines at advanced ages; it could be related to the mortality plateaus described above. Three other lines of research appear to hold promise. Group selection is a process by which collections of organisms succeed or fail as a collective, either becoming extinct or spawning new groups in competition with other groups. Group selection arguments are sometimes invoked to explain altruistic behaviors, but evolutionary biologists typically disdain group selection, because the process tends to be much weaker than selection between individuals, and also seems to work only in very small populations (Williams 1966). However, group selection might play a role in shaping post-reproductive mortality rates, when individual-level selection is essentially inoperative. This type of model could be particularly relevant to human evolution, and is essentially unexplored. A related type of model that needs further development involves the evolution of vital rates in combination with kin selection, taking into account the effects of post-reproductive survival, parental care, and fitness effects mediated through relatives in the kin group (Roach 1992). Finally, as noted by Tuljapurkar (1997), the theoretical methods are limited to small perturbations and local analyses, under which conditions the population is always close to demographic equilibrium. There does not exist at present a theory that can accommodate large mutational changes in vital rates in combination with non-equilibrium demographics.

See also: Aging and Health in Old Age; Aging, Theories of; Brain Aging (Normal): Behavioral, Cognitive, and Personality Consequences; Differential Aging; Lifespan Development, Theory of; Old Age and Centenarians; Spatial Memory Loss of Normal Aging: Animal Models and Neural Mechanisms
Bibliography
Charlesworth B 1980 Evolution in Age-structured Populations. Cambridge University Press, Cambridge, UK
Curtsinger J W, Service P, Prout T 1994 Antagonistic pleiotropy, reversal of dominance, and genetic polymorphism. American Naturalist 144: 210–28
Finch C E 1990 Longevity, Senescence, and the Genome. University of Chicago Press, Chicago
Fisher R A 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford, UK
Haldane J B S 1927 A mathematical theory of natural and artificial selection, Part IV. Mathematical Proceedings of the Cambridge Philosophical Society 23: 607–15
Hamilton W D 1966 The moulding of senescence by natural selection. Journal of Theoretical Biology 12: 12–45
Hartl D L, Clark A G 1997 Principles of Population Genetics, 3rd edn. Sinauer Associates, Sunderland, MA
Hughes K A, Charlesworth B 1994 A genetic analysis of senescence in Drosophila. Nature 367: 64–6
Medawar P B 1952 An Unsolved Problem of Biology. H K Lewis, London
Mueller L D 1987 Evolution of accelerated senescence in laboratory populations of Drosophila. Proceedings of the National Academy of Sciences of the USA 84: 1974–77
Norton H T J 1928 Natural selection and Mendelian variation. Proceedings of the London Mathematical Society 28: 1–45
Partridge L, Barton N H 1993 Optimality, mutation, and the evolution of ageing. Nature 362: 305–11
Pletcher S D, Curtsinger J W 1998 Mortality plateaus and the evolution of senescence: Why are old-age mortality rates so low? Evolution 52: 454–64
Pletcher S D, Curtsinger J W 2000 The influence of environmentally induced heterogeneity on age-specific genetic variance for mortality rates. Genetical Research Cambridge 75: 321–9
Pletcher S D, Houle D, Curtsinger J W 1998 Age-specific properties of spontaneous mutations affecting mortality in Drosophila melanogaster. Genetics 148: 287–303
Pletcher S D, Houle D, Curtsinger J W 1999 The evolution of age-specific mortality rates in Drosophila melanogaster: Genetic divergence among unselected lines. Genetics 153: 813–23
Promislow D E L, Tatar M, Khazaeli A A, Curtsinger J W 1996 Age-specific patterns of genetic variance in Drosophila melanogaster. I. Mortality. Genetics 143: 839–48
Roach D A 1992 Parental care and the allocation of resources across generations. Evolutionary Ecology 6: 187–97
Roughgarden J 1996 Theory of Population Genetics and Evolutionary Ecology: An Introduction. Prentice Hall, Upper Saddle River, NJ
Service P M 2000 Heterogeneity in individual mortality risk and its importance for evolutionary studies of senescence. American Naturalist 156: 1–13
Shaw F, Promislow D E L, Tatar M, Hughes L, Geyer C 1999 Towards reconciling inferences concerning genetic variation in senescence in Drosophila melanogaster. Genetics 152: 553–66
Tuljapurkar S 1997 The evolution of senescence. In: Wachter K W, Finch C E (eds.) Between Zeus and the Salmon. National Academy Press, Washington, DC, pp. 65–77
Vaupel J W, Carey J R, Christensen K, Johnson T E, Yashin A I, Holm N V, Iachine I A, Kannisto V, Khazaeli A A, Liedo P, Longo V D, Zeng Y, Manton K G, Curtsinger J W 1998 Biodemographic trajectories of longevity. Science 280: 855–60
Wachter K W 1999 Evolutionary demographic models for mortality plateaus. Proceedings of the National Academy of Sciences of the USA 96: 10544–7
Williams G C 1957 Pleiotropy, natural selection, and the evolution of senescence. Evolution 11: 398–411
Williams G C 1966 Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought. Princeton University Press, Princeton, NJ
J. W. Curtsinger
Sensation and Perception: Direct Scaling One aim of psychophysics is to measure the subjective intensity of sensation. We easily hear that a tone of given intensity has a particular loudness, and that a tone of higher intensity sounds louder still. The measurement problem lies in the question ‘How much louder?’ Direct scaling is a particular way of answering that question: the observer is asked to assign numbers corresponding to the subjective magnitudes of given stimulus intensities, thus providing the capability of saying that, for example, one tone sounds twice or ten times as loud as another.
1. History 1.1 Fechner’s Solution G. Fechner first proposed a solution, albeit an indirect one, in 1860 (Boring 1950). He accepted Weber’s law—that the difference threshold (the physical size of a difference in intensity needed to tell one signal from another) grows in proportion to signal intensity, determining a constant of proportionality, the Weber fraction, unique to each sensory continuum. Fechner assumed that all difference thresholds are subjectively constant and therefore can serve as the unit of measurement for scales of sensation. He concluded that subjective magnitude grows by constant differences as stimulus magnitude grows by constant ratios, a logarithmic relation now known as Fechner’s law. 1.2 Category Rating and Magnitude Scaling In Fechner’s laboratory, there also evolved a direct approach to measuring sensation magnitude. This was the method of absolute judgment (later called category rating), in which the subject assigns stimuli to categories according to some subjective aspect. Originally designed to study esthetic judgment, it was adapted by 1920 to studying sensation magnitude; early findings using this method supported Fechner’s law. In the 1950s, S. S. Stevens refined another and related direct scaling procedure in which the subject estimates the apparent magnitude of a signal by assigning to it a numerical value—hence, magnitude estimation. Data from this method were better fitted by a power law.
2. Methods

All the direct scaling methods require an observer to judge a series of stimuli varying along (at least) one dimension. For instance, the experimenter might present a pure tone at several sound pressure levels covering a 1000:1 range and request judgments of loudness, or present a light disk at several luminances covering a 10:1 range and request judgments of brightness.
2.1 Category Rating The observer is instructed to assign each stimulus to a category, which may be designated descriptively (very soft, soft, … very loud) or numerically (1, 2, … , 7); the assignments should create subjectively equal category intervals. The allowable response values, including the extreme values, are specified by the experimenter. Usually the number of categories ranges from 5 to 11, although smaller and larger values are sometimes used. Typically, the stimulus range does not exceed 10:1, although it may do so.
2.2 Magnitude Estimation and Production In magnitude estimation, the observer is instructed to assign to each stimulus a (positive) number so that the number reflects its apparent intensity. There are no limits on allowable response values. Originally, one stimulus value was designated as standard and a numerical value assigned to it as modulus. Later, however, the use of designated standard and assigned modulus was abandoned and the observer instructed to choose any numbers appropriate to represent apparent magnitudes. Typically the stimulus range is at least 100:1 and is often greater. In magnitude production, stimulus and judgmental continua are interchanged; the experimenter presents a set of numbers, and the observer produces values on some intensive continuum to effect a subjective match for each presented number.
2.3 Cross-modal Matching

The use of numbers, which has provoked many objections, can be avoided by instructing the observer to match intensities on one physical continuum directly to intensities on another. For example, the observer may produce luminances to match sound pressure levels so that brightness of a light disk is equivalent to loudness of a tone, or the observer may produce handgrip forces to match odor intensities so that perceived effort is equivalent to strength of smell.

3. Achievements of Direct Scaling

3.1 A Quantitative Phenomenology

These methods made possible a quantitative phenomenology of the various sensory systems. For example, in hearing, the importance of salient variables, such as frequency and intensity, to the ability of the observer to detect a sound or to tell the difference between two sounds, had long been known. Furthermore, by having the observer match tones of varying frequency and intensity to a fixed reference tone, equal loudness contours were established. What remained unknown were the loudness relations among these contours. Magnitude estimation, by providing a numerical reference scale, permits the observation that one tone sounds twice, or ten times, or half as loud as another. Thus, magnitude estimation has been used to map the effects of variables known to influence perception for loudness, brightness, tastes and smells, and tactile sensations, as well as internal states such as perceived effort, fatigue, and satiety (Stevens 1975). It has also been used to study continua without a physical measure (e.g., seriousness of crime, severity of punishment, and their relationship) (Sellin and Wolfgang 1964). Category rating has been employed to study the cognitive integration of perceptual dimensions (Anderson 1981).

3.2 Intermodal Comparisons

Direct scaling also provides the capability to compare and contrast phenomena that occur in several modalities, such as temporal and spatial summation, sensory adaptation and recovery, and spatial inhibition (brightness contrast and auditory masking), at levels above absolute threshold (Marks 1974). For example, a persisting olfactory stimulus of constant intensity produces a smell intensity that diminishes in strength over time. That diminution can be assessed by asking an observer to assign numbers to the apparent magnitude of the smell at different elapsed times, e.g., after 5 s, 10 s, 30 s, 60 s, … ; in this way, the course of sensory adaptation can be traced for this stimulus and for others of differing initial intensities. When the procedure is repeated in a different modality, the parameters of the adaptation curves can be compared.

3.3 Stevens' Power Law

The third major achievement of direct scaling is the discovery that, to at least a first approximation, equal stimulus ratios produce equal judgmental ratios. This nearly universal relation is called Stevens' power law, after S. S. Stevens who established it: subjective magnitude is proportional to intensity raised to some power. Further, that exponent takes distinctive values for each stimulus continuum, ranging from 0.3 for luminance to 2.0 or more for electric shock. Later experiments showed that the magnitude exponents predicted cross-modal matching exponents: if continua A and B had magnitude exponents a and b, and if a cross-modal match of B to A was obtained, the resulting exponent was a/b. Indeed, the cross-modal exponents for a variety of continua are connected in a transitive network (Daning 1983), a finding with important theoretical implications.
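In practice the power-law exponent is obtained by fitting a straight line to log magnitude estimates versus log intensity, since the law ψ = kI^a becomes log ψ = log k + a log I. A minimal sketch follows; the simulated exponents and noise level are invented for illustration.

```python
# Sketch: estimating a Stevens power-law exponent from magnitude estimates by
# log-log regression, and predicting a cross-modal matching exponent as a/b.
import numpy as np

rng = np.random.default_rng(3)
intensity = np.logspace(0, 3, 12)             # a 1000:1 stimulus range
a_true = 0.6                                  # assumed loudness-like exponent
judgment = 10 * intensity ** a_true * np.exp(0.1 * rng.normal(size=12))

# psi = k * I^a  =>  log psi = log k + a * log I, so the slope estimates a
a_hat = np.polyfit(np.log(intensity), np.log(judgment), 1)[0]
b = 0.3                                       # e.g., a brightness-like exponent
print(f"fitted exponent a = {a_hat:.2f}; "
      f"predicted cross-modal exponent a/b = {a_hat / b:.2f}")
```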
4. Problems of Direct Scaling

4.1 Use of Numbers as Responses

Critics questioned treating the numbers that subjects assigned to stimulus magnitudes as if they were measurements. However, the practice was validated by the discovery that the results obtained with number as the matching continuum are in agreement with the results obtained with cross-modal matching (Stevens 1975). Furthermore, in cross-modal matching, some other stimulus continuum can substitute for number, since the choice of a reference continuum is, for most purposes, arbitrary. Whether the numbers used by an observer constitute a direct measure of sensation is an unresolved, and perhaps unresolvable, question (Teghtsoonian 1974).

4.2 The Psychophysical Regression Effect

Since magnitude estimation (assigning numbers to physical intensities) and magnitude production (assigning physical intensities to match numbers) are inverses of each other, they should produce the same exponent. However, this is not the case. The size of the difference in exponents may be quite large, with the result that a precise value of the exponent for a given stimulus continuum cannot be specified; no satisfactory combination rule has been agreed upon. The size, and indeed the direction, of the difference depends on the range of physical intensities (or numbers) presented; for practical purposes, the smallest effect is exhibited when the range is large. Some portion, at least, of this regression effect depends on sequential effects, the influence exerted by previous stimuli and judgments on subsequent judgments.

4.3 Category vs. Ratio Scales

Although both category rating and magnitude estimation purport to give scales of sensation magnitude, they do not, in general, agree with each other: both scales can be characterized as power functions, but the exponents are not the same. Much discussion has
centered on whether subjects can make two kinds of judgments or whether the experimenter transforms a single kind of magnitude judgment by treating it as ratio or interval. However, an experimental determination of the sources of variance in the type of scale produced shows that instructions to judge either ratios or intervals account for almost none of the variance; much more important are such methodological variables as whether judgmental end-points are assigned or free and whether the range of stimuli is small or large. With free end-points and a large range, category judgment and magnitude estimation produce the same results; the obtained power functions have stable exponents characteristic of magnitude estimation (Montgomery 1975).

4.4 Local vs. Global Psychophysics

A long-standing conundrum in psychophysics has been the relation between the local (thresholds, both absolute and difference) and the global (measurements of subjective magnitude at levels above absolute threshold) (Luce and Krumhansl 1988). Fechner believed he had combined the two by using difference thresholds as subjective units to yield a logarithmic scale of sensation. Stevens (1961) proposed, while honoring Fechner, to repeal his law and to substitute a power scale of sensation; he believed that one could not derive measures of sensation magnitude from threshold determinations. An early attempt to reconcile these two positions was R. Teghtsoonian's argument (1971) that both difference thresholds and power law exponents are indices of dynamic range. He proposed that there is a common scale of sensory magnitude for all perceptual continua and that the observer's dynamic range for each continuum maps completely onto that common scale. For example, the least sound intensity experienced has the same subjective magnitude as the least luminance, and the greatest sound intensity to which the auditory system responds has the same subjective magnitude as the greatest luminance. Thus the mapping of widely divergent dynamic ranges for the several perceptual continua onto a single subjective magnitude range determines their power law exponents, and the mapping of widely divergent difference thresholds onto a single subjective difference threshold determines their Weber fractions.

See also: Fechnerian Psychophysics; Memory Psychophysics; Psychophysical Theory and Laws, History of; Psychophysics; Scaling: Correspondence Analysis; Stevens, Stanley Smith (1906–73).
Bibliography Anderson N H 1981 Foundations of Information Integration Theory. Academic Press, New York
Boring E G 1950 A History of Experimental Psychology, 2nd edn. Appleton-Century-Crofts, New York
Daning R 1983 Intraindividual consistencies in cross-modal matching across several continua. Perception and Psychophysics 33: 516–22
Luce R D, Krumhansl C L 1988 Measurement, scaling, and psychophysics. In: Atkinson R C, Herrnstein R J, Lindzey G, Luce R D (eds.) Stevens' Handbook of Experimental Psychology, 2nd edn. Vol. 1: Perception and Motivation. Wiley, New York
Marks L E 1974 Sensory Processes: The New Psychophysics. Academic Press, New York and London
Montgomery H 1975 Direct estimation: Effect of methodological factors on scale type. Scandinavian Journal of Psychology 16: 19–29
Sellin J T, Wolfgang M E 1964 The Measurement of Delinquency. Wiley, New York
Stevens S S 1961 To honor Fechner and repeal his law. Science 133: 80–6
Stevens S S 1975 Psychophysics: Introduction to its Perceptual, Neural, and Social Prospects. Wiley, New York
Teghtsoonian R 1971 On the exponents in Stevens's law and the constant in Ekman's law. Psychological Review 78: 71–80
Teghtsoonian R 1974 On facts and theories in psychophysics: Does Ekman's law exist? In: Moskowitz H R, Scharf B, Stevens J C (eds.) Sensation and Measurement: Papers in Honor of S. S. Stevens. Reidel, Dordrecht, The Netherlands
M. Teghtsoonian
Sensation Seeking: Behavioral Expressions and Biosocial Bases Sensation seeking is a personality trait defined as the tendency to seek varied, novel, complex, and intense sensations and experiences and the willingness to take risks for the sake of such experience (Zuckerman 1979, 1994). The standard measure of the trait is the Sensation Seeking Scale (SSS). In the most widely used version (form V) it consists of four subscales: (a) Thrill and Adventure Seeking (through risky and unusual sports or other activities); (b) Experience Seeking (through the mind and senses, travel, and an unconventional lifestyle); (c) Disinhibition (through social and sexual stimulation, lively parties, and social drinking); (d) Boredom Susceptibility (aversion to lack of change and variety in experience and people). Form V uses a total score based on the sum of the four subscales. More recently a single scale called ‘Impulsive Sensation Seeking’ has been developed which is a combination of items measuring the tendency to act impulsively without planning ahead, and adaptations of items from the SSS which assess the general need for excitement without mention of specific interest or activities (Zuckerman 1993). Similar constructs and measures have been developed by other researchers including: change seek-
ing, stimulus variation seeking, excitement seeking, arousal seeking, novelty seeking and venturesomeness. Most of these scales correlate very highly with the SSS and usually have the same kind of behavioral correlates. Novelty seeking, a construct and scale devised by Cloninger (1987), not only correlates highly with Impulsive Sensation Seeking, but is based on a similar kind of biobehavioral model. Individuals high on novelty seeking are described as impulsive, exploratory, fickle, and excitable. They are easily attracted to new interests and activities, but are easily distracted and bored. Those who are low on this trait are described as reflective, rigid, loyal, stoic, frugal, orderly, and persistent. They are reluctant to initiate new activities and are preoccupied with details, and think very carefully and long before making decisions.
1. Theoretical and Research Origins of the Construct

The first version of the SSS was an attempt to provide an operational measure of the construct 'optimal levels of stimulation and arousal' (the level at which one feels and functions best). The author was conducting experiments on sensory deprivation (SD) in which participants volunteered to spend some length of time (from 1 to 24 hours in the author's experiments and up to two weeks in those of other investigators) in a dark soundproof room. It was theorized that persons with high optimal levels of stimulation would be most deprived by the absence of stimulation. Experiments showed that high sensation seekers became more restless over the course of an eight-hour experiment but did not experience more anxiety than the low sensation seekers (defined by the first version of the SSS). The construct changed as research using the SSS was extended to the broader domain of life experience.
2. Behavioral Expressions

Contrary to what we expected, the volunteers for the SD experiments were more commonly high sensation seekers than lows. This was because they had heard that people had weird experiences in SD, like hallucinations. High sensation seekers volunteered for any kind of unusual experiments, like hypnosis or drug studies. They did not volunteer for more ordinary experiments. The concept of sensation seeking centered more around the need for novel stimulation or inner experiences rather than stimulation per se. Surveys of drug use and preferences showed that high sensation seekers tended to be polydrug users of both stimulant and depressant drugs whereas lows did not use any drugs (Segal et al. 1980). It was the novel experience of drugs, not their effect on arousal, that
attracted high sensation seekers. Of course hallucinatory drugs had a particular attraction for those who scored high on this trait. They were not deterred by the legal, social, or physical risks entailed in drug use. Sensation seeking in preadolescents predicts their later alcohol and drug use and abuse. Similarly sensation seekers proved to be attracted to risky sports that provided intense or novel sensations and experiences, like mountain climbing, parachuting, scuba diving, and car racing. They were not found in ordinary field sports or among compulsive exercisers. Outside of sports the high sensation seekers were found to drive their cars faster and more recklessly than lower sensation seekers. High sensation seekers are attracted to exciting vocations such as firefighting, emergency room work, flying, air-traffic control, and dangerous military assignments. When they are stuck in monotonous desk jobs they report more job dissatisfaction than low sensation seekers. In interpersonal relationships high sensation seekers tend to value the fun and games aspects rather than intimacy and commitment (Richardson et al. 1988). They tend to have more premarital sexual experience with more partners and engage in 'risky sex.' There is a high degree of assortative mating based on sensation seeking, i.e., highs tend to marry highs and lows tend to marry lows. Couples coming for marital therapy tend to have discrepant scores on the trait. Divorced persons are higher on the trait than the monogamously married. Not all expressions of sensation seeking are risky. High sensation seekers like designs and works of art that are complex and emotionally evocative (expressionist). They like explicit sex and horror films and intense rock music. When watching television they tend to frequently switch channels ('channel surfing'). High sensation seekers enjoy sexual and nonsense types of humor (Ruch 1988). Low sensation seekers prefer realistic pastoral art pictures, media forms like situation comedies, and quiet background music.
3. Psychopathology

Sensation seeking is an essentially normal trait and most of those who are very high or low on the trait are free from psychopathology. However, persons with certain kinds of disorders involving a lack of impulse control tend to be high sensation seekers. Sensation seeking is a primary motive for those with antisocial personality disorder; their antisocial behavior involves great risks where the only motive is sometimes the increase of excitement. Other disorders with a high number of sensation seekers include those with conduct disorders, borderline personality disorders, alcohol and drug abusers, and bipolar (manic-depressive) disorders. It has been discovered that there are genetic and biological trait links between many of
these disorders and sensation seeking. For instance, as yet unaffected children of bipolars tend to be high sensation seekers as well as showing similar differences on an enzyme to be discussed.
4. Genetics Studies of twins raised in intact families show a relatively high degree of heritability (60 percent), compared to other personality traits that are typically in the range 30–50 percent. Analyses show no effects for the shared family environment; the environment that is important is that outside of the family which affects each twin differently. A study of twins separated shortly after birth and adopted into different families confirms these results (Hur and Bouchard 1997). A specific gene has been found to be related to novelty seeking (Ebstein et al. 1996). The gene produces one class of receptors for the neurotransmitter dopamine. One of the two major forms of the gene is found more often in high sensation seekers. This form of the gene has also been found in high percentages of heroin abusers, pathological gamblers, and children with attention deficit hyperactivity disorder. The neurotransmitter dopamine as well as the other two monoamines in the brain, norepinephrine and serotonin, are theorized to underlie the three behavioral mechanisms involved in sensation seeking: strong approach, and weak arousal and inhibition.
5. Biochemistry
Males who are high in the sensation seeking trait have higher levels of the hormone testosterone than do average or low sensation seekers (Daitzman and Zuckerman 1980). This finding is consistent with the difference between men and women on the trait and with the finding that sensation seeking peaks in the late teens and declines with age in both sexes. The enzyme monoamine oxidase (MAO) type B is lower in high sensation seekers than in those who score low on the trait. This is also consistent with age and gender differences, since women are higher than men on MAO at all ages and MAO rises in the brain and blood platelets with age. Type B MAO is a regulator of the monoamines, particularly dopamine, and low levels imply a lack of regulation, perhaps related to the impulsivity characteristic of many high sensation seekers. Low levels of MAO are also found in disorders characterized by poor behavioral control: attention deficit hyperactivity disorder, antisocial and borderline personality disorders, alcoholism, drug abuse, pathological gambling disorder, and mania (bipolar disorder). MAO is part of the genetic predisposition for these disorders, as shown by the finding that the enzyme is low in as yet nonaffected children of alcoholics and those with bipolar disorder. Evidence
of behavioral differences in newborn infants related to MAO levels also shows the early effects of the enzyme on temperament. MAO differences are also related to behavioral traits in monkeys analogous to those of high and low sensation seeking humans.
6. Psychophysiology
Differences in the psychophysiological responses of the brain and autonomic nervous system as a function of stimulus intensity and novelty have been found and generally replicated (Zuckerman 1990). The heart rate response reflecting orienting to moderately intense and novel stimuli is stronger in high sensation seekers than in lows, perhaps reflecting their interest in novel stimuli (experience seeking) and disinterest in repeated stimuli (boredom susceptibility). The cortical evoked potential (EP) reflects the magnitude of the brain cortex response to stimuli. Augmenting–reducing is a measure of the relationship between the amplitude of the EP and the intensity of the stimulus. A high positive slope (augmenting) is characteristic of high sensation seekers (primarily those of the disinhibition type), and very low slopes, sometimes reflecting a reduction of response at the highest stimulus intensities (reducing), are found primarily in low sensation seekers. These EP augmenting–reducing differences have been related to differences in behavioral control in individual cats and strains of rats analogous to sensation seeking behavior in humans (Siegel and Driscoll 1996).
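As a concrete illustration of the augmenting–reducing measure, the sketch below fits a least-squares slope to evoked potential amplitudes recorded at increasing stimulus intensities and labels the result. It is a minimal sketch with invented numbers and an arbitrary cutoff, not code from any of the studies cited; real studies use standardized intensity series and more refined classification criteria.

    # Minimal sketch of the EP augmenting-reducing measure (illustrative data).
    import numpy as np

    intensities = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical relative stimulus intensities
    amplitudes = np.array([4.2, 5.1, 6.0, 7.2, 8.1])   # hypothetical EP amplitudes in microvolts

    # Slope of the amplitude-intensity function (least-squares linear fit)
    slope = np.polyfit(intensities, amplitudes, 1)[0]

    # A clearly positive slope = augmenting (associated in the text with high,
    # especially disinhibition-type, sensation seekers); a near-zero or negative
    # slope, reflecting reduced response at the highest intensities, = reducing.
    # The 0.5 cutoff is arbitrary, chosen only for this illustration.
    label = "augmenting" if slope > 0.5 else "reducing"
    print(f"slope = {slope:.2f} microvolts per intensity step -> {label}")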
7. Evolution
Comparative studies of humans and other species using the same biological markers suggest that the trait of impulsive sensation seeking has evolved in the mammalian line (Zuckerman 1984). Exploration and foraging are risky but adaptive in species that must move frequently to avoid exhaustion of the resources in an area. The balance between sensation seeking and fear determines exploration of the novel environment. Our own hominid species, which came out of Africa and settled the entire earth in about 100,000 years, had to have at least a moderate degree of sensation seeking. The hunting of large animals by men, warfare, and the seeking of mates outside of the group all involved risks which may have been overcome by the sensation-seeking pleasure in such activity. Individual differences in the trait are seen in human infants before the major effects of socialization take hold. This suggests that impulsive sensation seeking is a basic trait related to individual differences in the approach, arousal, and inhibition behavioral mechanisms in humans and other species.
8. Future Directions
One gene (DRD4) has been associated with sensation seeking, but it accounts for only 10 percent of the genetic variance. The search for other major genes will continue, and the understanding of what these genes do in the nervous system will fill in the large biological gap between genes and sensation-seeking behavior. Longitudinal studies that begin with genetic markers like the DRD4, biological markers like MAO, and behavioral markers like reactions to novelty will be used to find out how specific environmental factors interact with dispositions to determine the expressions of sensation seeking, for instance why one sensation seeker becomes a criminal and another a firefighter who does skydiving on weekends.
See also: Gender Differences in Personality and Social Behavior; Genetic Studies of Personality; Personality and Crime; Personality and Risk Taking; Temperament and Human Development; Temperament: Familial Analysis and Genetic Aspects
Bibliography
Cloninger C R 1987 A systematic method for clinical description and classification of personality variants. Archives of General Psychiatry 44: 573–88
Cloninger C R, Sigvardsson S, Bohman M 1988 Childhood personality predicts alcohol abuse in young adults. Alcoholism: Clinical and Experimental Research 12: 494–505
Daitzman R J, Zuckerman M 1980 Disinhibitory sensation seeking, personality, and gonadal hormones. Personality and Individual Differences 1: 103–10
Ebstein R P, Novick O, Umansky R, Priel B, Osher Y, Blaine D, Bennett E R, Nemanov L, Katz M, Belmaker R H 1996 Dopamine D4 receptor (D4DR) exon III polymorphism associated with the human personality trait of novelty seeking. Nature Genetics 12: 78–80
Hur Y, Bouchard T J Jr 1997 The genetic correlation between impulsivity and sensation seeking traits. Behavior Genetics 27: 455–63
Richardson D R, Medvin N, Hammock G 1988 Love styles, relationship experience, and sensation seeking: A test of validity. Personality and Individual Differences 9: 645–51
Ruch W 1988 Sensation seeking and the enjoyment of structure and content of humor: Stability of findings across four samples. Personality and Individual Differences 9: 861–71
Segal B S, Huba G J, Singer J L 1980 Drugs, Daydreaming and Personality: Study of College Youth. Erlbaum, Hillsdale, NJ
Siegel J, Driscoll P 1996 Recent developments in an animal model of visual evoked potential augmenting/reducing and sensation seeking behavior. Neuropsychobiology 34: 130–5
Teichman M, Barnea Z, Rahav G 1989 Sensation seeking, state and trait anxiety, and depressive mood in adolescent substance users. International Journal of the Addictions 24: 87–9
Zuckerman M 1979 Sensation Seeking: Beyond the Optimal Level of Arousal. Erlbaum, Hillsdale, NJ
Zuckerman M 1984 Sensation seeking: A comparative approach to a human trait. Behavioral and Brain Sciences 7: 413–71
Zuckerman M 1990 The psychophysiology of sensation seeking. Journal of Personality 58: 313–45
Zuckerman M 1993 Sensation seeking and impulsivity: A marriage of traits made in biology? In: McCown W G, Johnson J L, Shure M B (eds.) The Impulsive Client: Theory, Research, and Treatment. American Psychological Association, Washington, DC, pp. 71–91
Zuckerman M 1994 Behavioral Expressions and Biosocial Bases of Sensation Seeking. Cambridge University Press, New York
M. Zuckerman
Sensitive Periods in Development, Neural Basis of
In all animals including man, the organization of brain and behavior is not fully determined by genetic information. Information from the environment, mediated by sensory organs, plays an important role in shaping the central nervous system (CNS) or a given behavior to its adult appearance. Many studies have shown that the environmental influence on development not only differs between species, but also varies over the course of development. In many cases, there is only a very short period of time during which external stimulation affects development. The same stimulus, given earlier or later in life, may have no effect or no visible effect. This phenomenon, a time-limited influence of external stimulation on the wiring of the CNS or on the performance of a given behavior, has been called, with only subtle differences in meaning, ‘sensitive’ or ‘critical periods,’ or ‘sensitive phases.’ I shall use ‘sensitive period’ here simply because it is the most frequent term.
1. Examples of Developmental Phenomena with Sensitive Periods
1.1 Imprinting and Song Learning
Probably the best known example of early external influences of the environment on the organization of behavior is the so-called ‘imprinting’ process (Lorenz 1935), by which a young bird restricts its social preference to a particular animal or object. In the course of ‘filial imprinting,’ for example, a young chick or duck learns about the object that it has followed when leaving the nest (Hess 1973). Young zebra finches, in the course of ‘sexual imprinting’ (Immelmann 1972), learn the features of an object that subsequently releases courtship behavior in fully-grown birds. In addition to these two phenomena, many other paradigms of imprinting have been described, as for example homing in salmon, habitat
imprinting, acoustic imprinting, or celestial orientation in birds. All imprinting phenomena are characterized by at least two criteria (Immelmann and Suomi 1982). First, learning about the object which the bird will follow later on, or to which courtship behavior is directed, is restricted to a sensitive period early in development. In the case of filial imprinting, this phase is quite short (several hours) and it starts directly after hatching. In sexual imprinting, which has been investigated mainly in birds that hatch underdeveloped and with closed eyes, the sensitive period starts on the day of eye opening and may last for several days. The second feature of all imprinting paradigms examined so far is that the information storage is rather stable. The preference for an object to follow or to court, which has been established in the course of the sensitive period, cannot be altered later on. Whenever the bird has a chance to choose between the imprinted object and another one, it will choose the familiar imprinted object.
Song learning is, at first glance, somewhat more complicated than imprinting. It has been shown to comprise two parts (Konishi 1965, Marler 1970). Early in life a young male bird (in most avian species, only males sing) learns about the song that he himself will sing later as an adult, and he learns mainly from his father. At the time of learning, the young male is not yet able to sing by himself. This time span is called the acquisition period, and it is thought that the male stores some template of the song he has heard during this phase of learning. When the bird grows older, he starts singing by himself, and it has been shown that he tries to match his own song with the previously acquired template. During this ‘crystallization period,’ the young male selects his final song from a larger set of songs that he was singing at the beginning. Thereafter, this selected song remains stable and shows only minor variation. The other songs are, in most cases, no longer uttered. Song learning thus shows the same characteristics as imprinting: it occurs during a sensitive period, and after the crystallization period the song that has been selected cannot easily be altered. Recent research indicates that a second event like crystallization may also be required in imprinting. At least formally, one can separate in imprinting, too, an acquisition period (which is the ‘classical’ sensitive period) and a second event that may be called stabilization (Bischof 1994).
As already mentioned, development is an interplay between genetic instruction and acquired information. This is also the case in imprinting and song learning (Bischof 1997). In imprinting, not only is the behavior for which the object is learned genetically determined; there are also genetic constraints which at least make certain objects easier to learn than others. In filial imprinting, for example, it has been shown that there is a natural preference for red over other colors, and this preference
can be enhanced or diminished by selective breeding. In song learning, only a small variety of songs can be learned by most species, and there is some indication that certain features of song are innate and not alterable by early learning.
1.2 Plasticity of the Visual Cortex in Cats
The best-known example of sensitive periods in neural development is the plasticity of neurons of the visual cortex in cats (Hubel and Wiesel 1970). In the adult cat, most neurons in area 17 are driven by visual stimulation of the left as well as the right eye, and are thus defined as binocular. If one eye of a kitten is briefly sutured closed in its early postnatal life, the access of the eyes to cortical neurons is dramatically altered. There is an obvious lack of binocular neurons in the visual cortex of such kittens, and most of the neurons are driven exclusively by, or are at least strongly dominated by, the non-deprived eye. These changes in ‘ocular dominance distribution’ are observed only if monocular deprivation occurs during postnatal development; the same deprivation in an adult cat does not cause any change. Thus, there is a sensitive period during which the alteration of the visual input affects the wiring of neurons within the visual cortex. The wiring that has been established by the end of this sensitive period remains stable for the rest of life. Most of the results obtained in cats were later confirmed and extended by research on other animals, including monkey, ferret, rat, and mouse (Berardi et al. 2000). As in imprinting and song learning, there is evidence that in addition to early learning, genetic instruction plays an important role in the organization of the visual cortex. The basic pattern of wiring is already there at birth, and this basic pattern is either stabilized during the sensitive period if the sensory information is adequate, or can be altered if the information coming from the eyes deviates from normal, for example if one eye is closed, as in the experiments above, or if the eyes are not aligned appropriately, as is the case in squinting.
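The ocular dominance measurements behind such findings can be made concrete with a small sketch. The code below computes a conventional ocular dominance index for each neuron from its responses to stimulation of either eye and tallies how many neurons count as binocular. The contrast-ratio index itself is a standard measure in this literature, but the response values and the binocularity cutoff here are invented for illustration.

    # Sketch: ocular dominance index (ODI) per neuron from responses to each eye.
    # ODI = (contra - ipsi) / (contra + ipsi): +1 = driven only by the
    # contralateral eye, -1 = only by the ipsilateral eye, near 0 = binocular.
    # Firing rates and the binocularity cutoff below are invented for illustration.

    def odi(contra: float, ipsi: float) -> float:
        return (contra - ipsi) / (contra + ipsi)

    # Hypothetical firing rates (spikes/s) to (contralateral, ipsilateral) stimulation
    normal = [(20.0, 18.0), (15.0, 14.0), (22.0, 10.0), (9.0, 21.0)]
    deprived = [(30.0, 2.0), (25.0, 1.0), (28.0, 3.0), (26.0, 2.0)]  # contra eye left open

    for name, cells in [("normal", normal), ("monocularly deprived", deprived)]:
        binocular = sum(1 for c, i in cells if abs(odi(c, i)) < 0.33)
        print(f"{name}: {binocular}/{len(cells)} neurons classified binocular")

On these invented numbers the deprived animal shows the lack of binocular neurons described above, with nearly all cells dominated by the non-deprived eye.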
2. Similarity of Events with Sensitive Periods on the Behavioral and on the Neuronal Level
The short overviews in Sect. 1 already show that, in addition to the fact that the environment affects the organization of brain and behavior only during a sensitive period, there are three features that are similar in all examples. First, there is some genetic preorganization that is maintained or altered by sensory information during the sensitive period. Second, the shape of the sensitive period over time is very similar in all cases: the effect of environmental information
increases quickly to a maximum and then declines much more slowly, going asymptotically to zero. Third, once the sensitive period is over, the brain structure or the behavior is not easily altered again. However, so far the comparison has concerned only the phenomena; the effects may be due to very different mechanisms. It is therefore necessary to go into more detail and to discuss the mechanisms by which storage of information occurs in the different paradigms, and how the start, the duration, and the end of sensitive periods are determined. Sects. 2.1–2.3 will show that there is indeed a great deal of similarity on the level of mechanisms as well.
2.1 Control of Sensitivity
The control of the time span over which external information is able to affect brain and behavior was initially thought to be due to a genetically determined window which was opened for some time during development, allowing external information to access the CNS. However, this turned out to be too simple an idea, because the environmental sensitivity could be shifted in time by experimental conditions (Bateson 1979). Dark rearing, for example, delays the sensitive period during which ocular dominance can be shifted by monocular deprivation, and the sensitive period for imprinting lasts longer if the young bird is isolated and thus not able to see the appropriate stimulus which leads to imprinting. In imprinting, one can also show that the sensitive period is prolonged if the stimulus is not optimal. For example, exposing a zebra finch male to another species, such as a Bengalese finch, leads to a prolongation of the sensitive period. The ideas which were raised to explain these phenomena were as follows (Bischof 1997): the natural onset of the sensitive periods coincides with the onset of functioning of the sensory systems involved. Thus, the sensitive period for sexual imprinting in zebra finches starts at about 10 days, when the eyes are fully open, while in precocial chicks filial imprinting starts directly after hatching because these birds hatch with open eyes. The sensitive period for monocular deprivation should also start at eye opening. This is roughly correct, but the effect is quite low at the beginning, and some recent results indicate that eye opening may trigger some intermediate events that then lead to enhanced sensitivity of the affected area of the brain. Why does the sensitive period have a time course with a quite sharp increase to a maximum of sensitivity, but a slow, asymptotic decrease? One idea is that the sensitive period is a self-terminating process. If we suppose that an object is described by a limited amount of information bits, or, on the storage side, that there is a limited store for the information which has to be acquired, it is easy to imagine that the probability for storage of a given bit of information is
high at the beginning and goes asymptotically to zero, depending on the amount of information already stored (Bischof 1985).
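One way to make this self-terminating account explicit (the formalization is mine, not a set of equations given by Bischof) is to let I(t) denote the information already stored and to assume that the storage rate is proportional to the remaining free capacity:

    \frac{dI}{dt} = k\,\bigl(I_{\max} - I(t)\bigr), \quad I(0) = 0 \;\Rightarrow\; I(t) = I_{\max}\bigl(1 - e^{-kt}\bigr), \qquad S(t) := \frac{dI}{dt} = k\,I_{\max}\,e^{-kt}

The sensitivity S(t), the rate at which new information can still be stored, then decays asymptotically to zero as the store fills, reproducing the slow decline described above; the initial sharp rise to the maximum would, on this account, reflect the maturation of the sensory systems rather than the storage process itself. The model also captures why an absent or suboptimal stimulus prolongs the period: little is stored, so capacity, and with it sensitivity, remains high.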
2.2 Sites and Modes of Storage
For all the examples of phase-specific developmental phenomena in Sect. 1, the locations within the brain where the plastic events can be observed are known. Therefore, it is possible to compare the changes of wiring between the different examples. It was in visual cortex plasticity that it was first detected that the anatomical basis for the development of neuronal specificity is a segregation of previously overlapping neuronal elements within area 17 (LeVay et al. 1978). Thus, the specification of neurons was caused by a reduction of preexisting neuronal elements. This principle was also found in imprinting and song learning. In both paradigms, the spine density of neurons within the areas that are involved in the learning process is substantially reduced in the course of the sensitive period (Bischof 1997). This indicates that pruning of preexisting elements is an essential part of the physiological mechanisms underlying phase-dependent developmental plasticity and learning. The reduction of spine density is stable thereafter; the spine density cannot be enhanced again by any treatment once the reduction has occurred. This is a further indication that the reduction of spines may be the anatomical basis of imprinting-like learning and developmental plasticity.
2.3 Ultrastructural Events
It has long been speculated that the machinery causing learning-induced changes during imprinting, song learning, and cortical plasticity may be different from that causing changes in adult learning, because there is a decrease in spine density instead of an increase, and the changes are stable and cannot be reversed. However, concerning the basic machinery, no significant differences have been found. Developmental learning can obviously also be explained by ‘Hebbian’ synapses, which strengthen their connections if pre- and postsynaptic neurons fire together, and disconnect if the activity is asynchronous. To cause changes in the postsynaptic neuron, NMDA receptors are involved, as in other learning paradigms, causing long-term potentiation (LTP) or long-term depression (LTD), as well as the cascades of second messengers which finally lead to the activation of the genome, which then causes long-term changes in synaptic efficiency. The difference from adult learning obviously lies in the fact that plasticity is limited to a certain time span. Many ideas have been advanced about which systems may gate plasticity (Katz 1999). One of the earliest ideas was that myelination delimits plasticity. Unspecific projections (adrenergic, serotoninergic, or cholinergic)
have been shown to play a role wherever this has been investigated. Recent findings from experiments with knockout animals point towards neurotrophic agents, which may be available only in limited amounts; when the resource is exhausted, plasticity is no longer possible (Berardi and Maffei 1999). Another very interesting finding is that inhibition plays a role (Fagiolini and Hensch 2000); neurons may have to reach a genetically determined balance of inhibition and excitation to become plastic. Whether these latter ideas can also be applied to the other early learning paradigms remains to be examined.
3. Is Generalization to Humans Possible?
One has to be very careful in generalizing examples from one species to another, and this is even more true for generalization from animals to humans. However, there are some hints that at least part of the results described in Sect. 2 can be applied to humans (Braun 1996). On the neuronal level, it is generally accepted that amblyopia, a visual deficit, is based on the mechanisms described for the development of the visual cortex. It has been shown that if the misalignment of the eyes which causes amblyopia is corrected in humans during early development, the connection between the eyes and cortical neurons can still develop normally; this is no longer possible in adults (Hohmann and Creutzfeldt 1975). On the behavioral level, it has been shown that language learning has so much in common with song learning (Doupe and Kuhl 1999) that it is intriguing to speculate that the neuronal machinery may be similar in both paradigms. However, one has to be aware that the similarity is as yet only on the phenomenological level. Since the early days of imprinting research, it has also been discussed whether aggressiveness, social competence, and similar traits are imprinted (Leiderman 1981), and whether this can also be applied to humans. The frightening idea was that in this case parents could easily make big mistakes if they did not confront their children with the appropriate surroundings. However, evidence is sparse even in animals that social competence is severely influenced by early experience. If there is an influence, ways are available, even in the case of sexual imprinting in the zebra finch, to at least temporarily overcome the effects of imprinting. However, the fact that imprinting effects are in most cases only masked, not eliminated, may be reason enough to pay some attention to the conditions under which children grow up.
See also: Birdsong and Vocal Learning during Development; Brain Development, Ontogenetic Neurobiology of; Neural Plasticity in Visual Cortex;
Prenatal and Infant Development: Overview; Visual Development: Infant
Bibliography
Bateson P 1979 How do sensitive periods arise and what are they for? Animal Behaviour 27: 470–86
Berardi N, Maffei L 1999 From visual experience to visual function: Roles of neurotrophins. Journal of Neurobiology 41: 119–26
Berardi N, Pizzorusso T, Maffei L 2000 Critical periods during sensory development. Current Opinion in Neurobiology 10: 138–45
Bischof H-J 1985 Environmental influences on early development: A comparison of imprinting and cortical plasticity. In: Bateson P P G, Klopfer P H (eds.) Perspectives in Ethology, Vol. 6: Mechanisms. Plenum Press, New York, pp. 169–217
Bischof H-J 1994 Sexual imprinting as a two-stage process. In: Hogan J A, Bolhuis J J (eds.) Causal Mechanisms of Behavioural Development. Cambridge University Press, Cambridge, UK, pp. 82–7
Bischof H-J 1997 Song learning, filial imprinting, and sexual imprinting: Three variations of a common theme? Biomedical Research–Tokyo 18(Suppl. 1): 133–46
Braun K 1996 Synaptic reorganization in early childhood experience and learning processes: Relevance for the development of mental diseases. Zeitschrift für Klinische Psychologie, Psychiatrie und Psychotherapie 44: 253–66
Doupe A J, Kuhl P K 1999 Birdsong and human speech: Common themes and mechanisms. Annual Review of Neuroscience 22: 567–631
Fagiolini M, Hensch T K 2000 Inhibitory threshold for critical-period activation in primary visual cortex. Nature 404: 183–6
Hess E H 1973 Imprinting: Early Experience and the Developmental Psychobiology of Attachment. Van Nostrand Reinhold, New York
Hohmann A, Creutzfeldt O D 1975 Squint and the development of binocularity in humans. Nature 254: 613–4
Hubel D H, Wiesel T N 1970 The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology (London) 206: 419–36
Immelmann K 1972 The influence of early experience upon the development of social behaviour in estrildine finches. Proceedings of the 15th International Ornithological Congress, The Hague, pp. 316–38
Immelmann K, Suomi S J 1982 Sensitive phases in development. In: Immelmann K, Barlow G W, Petrinovich L, Main M (eds.) Behavioural Development. Cambridge University Press, Cambridge, UK, pp. 395–431
Katz L C 1999 What's critical for the critical period in visual cortex? Cell 99: 673–6
Konishi M 1965 The role of auditory feedback in the control of vocalization in the white-crowned sparrow. Zeitschrift für Tierpsychologie 22: 770–83
Leiderman P H 1981 Human mother–infant social bonding: Is there a sensitive phase? In: Immelmann K (ed.) Behavioral Development: The Bielefeld Interdisciplinary Project. Cambridge University Press, Cambridge, UK
LeVay S, Stryker M P, Shatz C J 1978 Ocular dominance columns and their development in layer IV of the cat's visual cortex: A quantitative study. Journal of Comparative Neurology 179: 223–44
Lorenz K 1935 Der Kumpan in der Umwelt des Vogels. Journal für Ornithologie 83: 137–213, 289–413
Marler P 1970 A comparative approach to vocal learning: Song development in white-crowned sparrows. Journal of Comparative and Physiological Psychology 71(Suppl.): 1–25
H.-J. Bischof
Sentence Comprehension, Psychology of
In the process of mapping from form (speech or printed text) to meaning, listeners and readers have the task of combining individual word meanings into sentence meanings. This article examines the cognitive challenges posed by this task, describes some experimental techniques that psycholinguists have used to understand the task, summarizes some basic empirical phenomena of sentence comprehension, and surveys the range of cognitive theories that have been developed to explain how people comprehend sentences.
1. The Tasks of Sentence Comprehension
Listeners and skilled readers recognize individual words in a manner which appears effortless but which actually hides a wealth of complex cognitive processes (e.g., see Lexical Access, Cognitive Psychology of; Word Recognition, Cognitive Psychology of). They can go further and identify the message being conveyed by a discourse, they can create a mental model of the scenario being described, and they can engage in further social interaction with the person who produced the words. The gap between word and message is bridged by the process of sentence comprehension.
A sentence conveys meaning. The meaning of a sentence is composed out of the meanings of its words, guided by the grammatical relations that hold between the words of the sentence. The psychology of sentence comprehension is concerned with the cognitive processes that permit a reader or listener to determine how the word meanings are to be combined in a way that satisfies the intention of the writer or speaker.
The reader/listener’s task of mapping from print or sound to meaning is not trivial. Small differences in the input can make great changes in meaning. Consider the contrast between ‘Tom gave his dog biscuits’ and ‘Tom gave him dog biscuits.’ The difference between ‘his’ and ‘him’ signals a difference in the ‘case’ of the pronoun, an aspect of its morphology that carries information about the relation it has to a verb or some other ‘case-assigning’ word in a sentence. The form ‘him’ signals the accusative case, which means that the pronoun has some immediate relation to the verb, in this case the indirect object or ‘recipient’ of the action denoted by the verb. The form ‘his,’ on the other hand, signals the possessive or ‘genitive’ case, which means
that the pronoun is immediately related to the following noun ‘dog.’ This whole noun phrase in turn takes the recipient role. Morphologically signaled case actually plays a minor role in English, although it plays a very important role in many other languages. Languages like English signal the structural relations between their words primarily by word order (compare ‘The dog bit the man’ and ‘The man bit the dog’), and English, like all languages, signals structural relations by the choice of particular lexical items (compare ‘The man helped the child to first base’ and ‘The man helped the child on first base’). Listeners and readers must be sensitive to subtle pieces of information about the structure of sentences.
Their task is complicated by the fact that they must also be sensitive to structural relations that span arbitrarily long distances. Consider sentences like ‘The boy likes the girl,’ ‘The boy from the small town likes the girl,’ ‘The boy from the small town where I grew up likes the girl,’ and so forth. There is no limit to the amount of material that can be included in the sentence to modify ‘the boy.’ Nonetheless, the form of the final verb, ‘likes,’ must agree in number with the initial subject, ‘the boy.’ Or consider a sentence like ‘Which girl did the boy from the small town … like?’ The initial question phrase, ‘which girl,’ must be understood to be the direct object of the final verb ‘like,’ even though the two phrases can be separated by an arbitrarily long distance. Readers and listeners (and writers and speakers) are clearly sensitive to such ‘long distance dependencies’ (cf. Clifton and Frazier 1989, Fodor 1978).
A third problem faced by listeners and readers is the ubiquitous ambiguity (temporary or permanent) of language. Ambiguity begins in the speech stream, which often can be segmented at different points into different words (see, e.g., Speech Perception; see also Cutler et al. 1997). A given word form can correspond to different lexical concepts (e.g., the money vs. river meaning of ‘bank’). And a string of words can exhibit different structural relations, which may or may not be resolved by the end of a sentence (consider, e.g., the temporary ambiguity in ‘I understood the article on word recognition was written by a real expert’). Readers and listeners are generally not aware of such ambiguities (although puns force awareness), but that only means that their cognitive processes adequately resolve the ambiguities in the course of sentence comprehension.
1.1 The Role of Psycholinguistics in the Development of Cognitive Psychology
The modern study of sentence comprehension began at the start of the cognitive revolution, in the late 1950s and early 1960s. Prior to this time, complex behaviors were claimed to be the result of (possibly complex) stimulus–response contingencies. Sentences were presumed to be produced and understood by chaining
strings of words together, under the control of environmental and internal stimuli. Psycholinguistic phenomena of sentence comprehension (and language acquisition; see, e.g., Language Acquisition) demonstrated the inadequacy of stimulus–response chaining mechanisms. The sheer fact that we are always understanding and producing completely novel sentences was hard enough to explain in a stimulus–response framework, but even sharper arguments against behaviorism were constructed. Consider just one example. Long-distance dependencies, discussed above, could not be analyzed in terms of stimulus–response chains. If stimuli are chains of words, then the effective stimulus that links a ‘which’ question phrase with the point in the sentence where it has to be interpreted (often referred to as a ‘gap’) would have to be different for each added word that intervenes between the ‘which’ word (‘stimulus’) and its gap (‘response’). In the limit, this means that an arbitrarily large (potentially infinite) number of stimuli would have to be associated with a response, which is not possible within stimulus–response theory.
1.2 The Role of Linguistic Knowledge in Sentence Comprehension
Arguments like the one just given were among the strongest reasons to develop a psychology of cognitive processes (see Miller 1962). Cognitive psychologists took arguments like this to support the claim that the mind had to have processes that operated on structures more abstract than stimulus–response chains. Many psycholinguists took the new generative grammars being developed by Chomsky (1957, see also Chomsky 1986) to provide descriptions of the structures that cognitive processes could operate on. One important structure is phrase structure, the division of a sentence into its hierarchically arranged phrases and the labeling of these phrases. For example, ‘I understood the article on word recognition was written by an expert’ is divided into the subject ‘I’ and the verb phrase ‘understood … expert’; this verb phrase is divided into the verb ‘understood’ and its embedded sentence complement ‘the article … was written by an expert’; this sentence is similarly divided into subject and verb phrase, and so forth. Another structure (identified only after several years of development of linguistic theory) is the long-distance dependency between ‘moved items’ or ‘fillers’ and their ‘traces’ or ‘gaps’ (as in the relation between a ‘which’ phrase and its gap, discussed earlier). Still other structures involve case relations (e.g., the distinction between accusative and genitive case discussed earlier) and thematic role relations (e.g., the distinction between theme or affected object and recipient in ‘Tom gave his dog biscuits’).
Early psycholinguists devoted much attention to the ‘psychological reality’ of grammatical structures, arguing whether they are really involved in sentence
comprehension (Fodor et al. 1974). As psycholinguistic theory developed, it became apparent that the real debate involves not whether these structures are real but how they are identified in the course of sentence comprehension and how, in detail, the mind creates and represents them (Frazier 1995). It now seems impossible to explain how we understand sentences without theorizing about how people assign structure to sentences. However, as we will see later, there is very lively debate about just how we do identify sentence structure.
2. Ways of Studying Sentence Comprehension
Sentence comprehension seems almost effortless and automatic. How can one observe the fine-grain details of such a smoothly flowing process? Early psycholinguists focused on what was remembered once a sentence was comprehended. They learned some interesting things. Grammatical structure can predict which sentences will be confused in recall and recognition and it can predict what words from a sentence will be good recall probes. While these findings supported the ‘psychological reality’ of grammatical structures, other findings indicated that the gist of a sentence was typically more salient in memory than its specific form or even its grammatical structure. (See Tanenhaus and Trueswell 1995 for a more detailed review.)
2.1 Online Measures
Showing that the end state of comprehension is gist tells us little about how meaning is arrived at. The early psycholinguists recognized this, and developed ‘online’ tasks to probe the early stages of sentence comprehension. Some used a technique of sentence verification, measuring the time taken to decide whether a sentence was true or false (e.g., when spoken of a picture). Others (e.g., Just et al. 1982; see also Haberlandt 1994) measured the time readers took to read each word or phrase of a sentence when sentence presentation was under their control (pressing a button brings on each new word), thereby getting a more precise look at the difficulty readers experience word by word and allowing the development of theories about how readers construct interpretations of sentences. The development of techniques for measuring eye movements during reading (Rayner et al. 1989; see also Eye Movements in Reading), coupled with the discovery that the eye generally seems to be directed toward whatever input the reader is processing, allowed even more sensitive and less disruptive ways of measuring the course of reading.
In addition to devising techniques for measuring which parts of sentences readers found easy or hard, psycholinguists have developed ways to assess just what a reader or listener might be thinking about at any given time. One can interrupt a spoken sentence with a probe (e.g., a word to name or to recognize from earlier in the sentence) and use probe reaction times to make inferences about how highly activated the probe word is. For example, probes for fillers such as ‘girl’ in ‘Which girl did the boy from the small town like’ are sometimes observed to be faster after the word that assigns it a thematic role (‘like’) than at other points in the sentence, as if the filler was reactivated at the gap. (Note that this technique is methodologically tricky and can mislead a researcher if it is misused; McKoon et al. 1994.) One can also measure ERPs (event-related brain potentials, electrical traces of underlying brain activities measured at the scalp) at critical points in sentences, for example, when a sentence becomes implausible or ungrammatical or simply difficult to process (Kutas and van Petten 1994). There appear to be distinct signatures of different types of comprehension disruption. For instance, implausibility or violation of expectations commonly leads to an ‘N400,’ a negative-going potential shift peaking about 400 ms after the introduction of the implausibility. This allows a researcher to test theories about just when and how disruption will appear in hearing or reading a sentence.
3. Phenomena of Sentence Comprehension
3.1 Clausal Units
Early theories of processing, guided by linguistic analyses that posited ‘transformational’ rules whose domains could be as large as a full clause, proposed that full sentence analysis and interpretation took place only at a clause boundary. These theories were supported by evidence that memory for verbatim information declines across a clause boundary, that clauses resist interruption by extraneous sounds, and that readers pause at the ends of clauses and sentences to engage in ‘wrap-up’ processes (Fodor et al. 1974, Just and Carpenter 1980).
3.2 Immediacy
However, a great deal of evidence now exists indicating that readers do substantial interpretation on a word-by-word basis, without waiting for a clause boundary. The initial evidence for this claim came from Marslen-Wilson’s (1973) work on ‘fast shadowing’ (repeating what one hears, with no more than a quarter-second delay). He showed that if the shadowed message contained a mispronunciation of a word’s end (e.g., ‘cigareppe’), the shadower would fairly often correct it spontaneously. Since this happened more often in
normal than in scrambled sentences, the listener must be using grammatical structure or meaning within a quarter-second or so to constrain recognition of words. Evidence from measuring eye movements in reading led to a similar conclusion (see Eye Movements in Reading). For instance, if a person reads ‘While Mary was mending the sock fell off her lap,’ the reader’s eyes are likely to fixate a long time on the word ‘fell’ or to regress from that word to an earlier point in the sentence. This happens because the verb ‘fell’ is grammatically inconsistent with the normally preferred structural analysis of ‘… mending the sock’; ‘the sock’ must be the subject of ‘fell,’ not the object of ‘mending.’ Similarly, a person who reads ‘When the car stopped the moon came up’ is likely to fixate a longer than normal time on ‘moon.’ Here, ‘moon’ is simply implausible as the direct object of ‘stop,’ not ungrammatical. This pattern of results indicates that readers (and presumably listeners) create and evaluate grammatical and meaning relations word by word, without waiting for a clause boundary. It is tempting to hypothesize that readers and listeners always perform a full semantic analysis of sentences word by word, with never a delay. This ‘immediacy hypothesis’ may be too strong. Some evidence indicates that, while true lexical ambiguities (e.g., the difference between the bank of a river and a bank for money) may be resolved word by word, sense ambiguities (e.g., the difference between a newspaper as something to read and a newspaper as an institution) may not be (Frazier and Rayner 1990). Similarly, determining the antecedent of a pronoun may be a task that readers do not always complete at the first opportunity. Nonetheless, one secure conclusion is that readers and listeners generally understand a great deal of what a sentence conveys with little or no lag.
3.3 Garden-pathing
Another much-studied phenomenon is ‘garden-pathing.’ When readers and listeners construct word-by-word interpretations of sentences, they sometimes make mistakes. Bever (1970) initiated the study of garden-pathing with his infamous sentence ‘The horse raced past the barn fell’ (compare ‘The car driven into the garage caught fire’ if you doubt that the former sentence is actually possible in English). Readers are ‘led down the garden path’ by their preference to take ‘horse’ as the agent of ‘race,’ and are disrupted when the final verb ‘fell’ forces a revision so that ‘raced past the barn’ is taken as a relative clause that modifies ‘horse,’ and ‘horse’ is taken to be the theme of ‘race’ and the subject of ‘fell.’ The mistakes readers and listeners make can be extremely diagnostic of the decision rules they follow.
A simple empirical generalization (which may or may not describe the cognitive process a reader or listener is engaging in; see below) is that when a choice between two analyses has to be made, the reader/listener initially favors (a) the choice that is simpler in terms of grammatical structure, or (b) in the absence of complexity differences, the choice that permits new material to be related to the most recently processed old material. In the ‘horse raced’ sentence, the main verb analysis is simpler than the relative clause analysis. In ‘While Mary was mending the sock fell off her lap,’ a reader/listener prefers to relate ‘the sock’ to the recently processed verb ‘mending’ rather than to material that has not yet been received. In both cases, these preferences are disconfirmed by material that follows, resulting in a garden-path. A great deal of experimental work using techniques of self-paced reading and eye movement measurement has indicated that reading is slowed (and regressive eye movements are encouraged) at points where garden-paths occur. Similarly, experimental research using ERPs has indicated the existence of distinct patterns of neural response to being garden-pathed. Research using these, and other, techniques has gone further and demonstrated the existence of subtle garden-paths that may not be apparent to conscious introspection (e.g., the temporary tendency to take ‘the article …’ as the direct object of ‘understand’ in ‘I understood the article on word recognition was written by a real expert’). Normally, it appears that the rules readers and listeners use to decide on the grammatical structures of sentences function so smoothly that their operation cannot be observed. But just as the analysis of visual illusions allows insight into the normal processes of visual perception by identifying when they go astray, identifying when a rule for analyzing sentence structure gives the wrong answer can go far in determining what rule is actually being followed.
3.4 Lexical and Frequency Effects
Once the existence and basic nature of garden-paths were discovered, researchers realized that garden-paths did not always occur. The sentence presented earlier, ‘The car driven into the garage caught fire,’ does not seem to lead to a garden-path. This may be due in part to the fact that the verb ‘driven’ is unambiguously a participle, not a main verb, so that the normally preferred simpler analysis is grammatically blocked. However, other cases indicate that more subtle effects exist. Sentences with verbs that are obligatorily transitive are relatively easy (e.g., ‘The dictator captured in the coup was hated’ is easier than ‘The dictator fought in the coup was hated’). Sentences with verbs like ‘cook,’ whose subject can take on the thematic role of theme/affected object, are particularly easy (‘The soup cooked in the pot tasted good’).
Sentences in which the subject is implausible as agent of the first verb but plausible as its theme are easier than sentences in which the subject is a plausible agent (e.g., ‘The evidence examined by the lawyer was unreliable’ is easier than ‘The defendant examined by the lawyer was unreliable,’ although this difference may depend on a reader being able to see the start of the disambiguating ‘by’ phrase while reading the first verb). Consider a different grammatical structure, prepositional phrase attachment. It is easier to interpret a prepositional phrase as an argument of an action verb than as a modifier of a noun (e.g., ‘The doctor examined the patient with a stethoscope’ is easier than ‘The doctor examined the patient with a broken wrist’), which is consistent with a grammatical analysis in which the former sentence is structurally simpler than the latter. However, this difference reverses when the verb is a verb of perception or a ‘psychological’ verb and the noun that may be modified is indefinite rather than definite (e.g., ‘The salesman glanced at a customer with suspicion’ is harder than ‘The salesman glanced at a customer with ripped jeans’) (Spivey-Knowlton and Sedivy 1995; see also Frazier and Clifton 1996, Mitchell 1994, Tanenhaus and Trueswell 1995, for further discussion and citation of experimental articles). These results indicate that grammatical structure is not the only factor that affects sentence comprehension. Subtle details of lexical structure do as well. Further, the sheer frequency with which different structures occur and the frequency with which particular words occur in different structures seem to affect sentence comprehension: for whatever reason, more frequent constructions are easier to comprehend. For example, if a verb is used relatively frequently as a participle compared to its use as a simple past tense, the difficulty of the ‘horse raced’ type garden-path is reduced. Similarly, the normal preference to take a noun phrase following a verb as its direct object (which leads to difficulty in sentences like ‘I understood the article on word recognition was written by an expert’) is reduced when the verb is more often used with sentence complements than with direct objects. MacDonald et al. (1994) review these findings, spell out one theoretical interpretation which will be considered shortly, and discuss the need to develop efficient ways of counting the frequency of structures and to decide on the proper level of detail for counting structures.
3.5 Effects of Context
Sentences are normally not interpreted in isolation, which raises the question of how context can affect the processes by which they are understood. One line of research has emphasized the referential requirements of grammatical structures. One use of a modifier, such as a relative clause, is to select one referent from a set
of possible referents. For instance, if there are two books on a table, you can ask for the ‘book that has the red cover.’ Some researchers (e.g., Altmann and Steedman 1988) have suggested that the difficulty of the relative clause in a ‘horse raced’ type garden-path arises simply because the sentence is presented out of context and there is no set of referents for the relative clause to select from. This suggestion entails that the garden-path would disappear in a context in which two horses are introduced, one of which was raced past a barn. Most experimental research fails to give strong support to this claim (cf. Mitchell 1994, Tanenhaus and Trueswell 1995). However, the claim may be correct for weaker garden-paths, for example the prepositional phrase attachment discussed above (‘The doctor examined the patient with a broken wrist’), at least when the verb does not require a prepositional phrase argument. And a related claim may even be correct for relative clause modification when temporal relations, not simply referential context, affect the plausibility of main verb vs. relative clause interpretations (cf. Tanenhaus and Trueswell 1995).
3.6 Effects of Prosody
Most research on sentence comprehension has focused on reading. However, some experimental techniques permit the use of spoken sentences, which raises the possibility of examining whether the prosody of a sentence (its rhythm and melody) can affect how it is comprehended. The presence of a prosodic boundary (marked in English by a change in pitch at the end of the prosodic phrase, elongation of its final word, and possibly the presence of a pause) can affect the interpretation of a sentence. It can even eliminate the normal preferences that result in garden-paths. For example, Kjelgaard and Speer (1999) showed that the proper pattern of prosodic boundaries can eliminate the normal difficulty of sentences like ‘Whenever Madonna sings a song is a hit.’ As these authors emphasize, though, there is more to prosody than just putting in pauses. The entire prosodic pattern of an utterance (including the presence and location of ‘pitch accents’; Schafer et al. 1996) can affect how it is interpreted.
4. Psychological Models of Sentence Comprehension 3.5 Effects of Context Sentences are normally not interpreted in isolation, which raises the question of how context can affect the processes by which they are understood. One line of research has emphasized the referential requirements of grammatical structures. One use of a modifier, such as a relative clause, is to select one referent from a set
Psycholinguists have learned a great deal about the phenomena of sentence comprehension. This does not mean that they agree about the cognitive processes of readers and listeners that produce these phenomena. Early theories of how people understand sentences claimed that the rules that make up a linguistic theory of a language are known (implicitly) to language users,
who use the rules directly in comprehending language (see Fodor et al. 1974). This did not work with the rules provided by early transformational grammars. Their rules operated on domains as large as a clause, and it soon became clear that people did not wait for the end of a clause to understand a sentence. These early direct incorporation theories were replaced by what can be called ‘detective-style’ theories. Readers and listeners were presumed to search for clues of any kind to the relations among words in a sentence. Psycholinguistic theorizing became little more than listing cues that people used. In the 1970s, as grammars changed to use more restrictive types of rules (especially emphasizing phrase-structure rules), a new breed of grammar-based theories developed. These theories added descriptions of decision strategies to the claim that people use the rules provided by the grammar, focusing on how they choose among various alternative rules in analyzing a sentence (Frazier 1987; cf. Frazier and Clifton 1996, Mitchell 1994, for further discussion of these theories). They claimed, for instance, that readers and listeners incorporate each new word into a sentence structure in the simplest and quickest possible way. This led to compelling accounts of garden-path phenomena, and the theories under discussion are often referred to as ‘garden-path’ theories. In the 1970s and 1980s, linguistic theory moved away from positing broad, generally applicable rules to making claims about information contained in individual lexical items (constrained by some extremely broad general principles). This encouraged the development of ‘lexicalist’ theories of sentence comprehension (see MacDonald et al. 1994, for the most completely developed statement of such a theory). These theories emphasize the specific contribution of individual words, rather than the effects of conformity to generally applicable word configurations. An individual word can provide a variety of kinds of information, including the possible and preferred thematic roles for its arguments, the frequency with which it is used in various constructions, and the entities it can refer to, as well as information about the phrase structure configurations it can appear in. Lexicalist theories propose that all these kinds of information are available and are used in deciding among alternative analyses. It is clear that contemporary theories of sentence comprehension differ in the range of information that they claim guides sentence analysis. Garden-path theories are ‘modular’ (cf. Fodor 1983) in that they claim that only certain necessarily relevant types of information affect initial decisions about sentence structure. Lexicalist theories are generally nonmodular in allowing many different kinds of information to affect these decisions. The theories also differ in what can be called depth-first vs. breadth-first processing (see Clifton 2000, for discussion). Garden-
path theories are typically depth-first (although logically they need not be); they typically claim that a single analysis is built initially and then evaluated. Lexicalist theories are typically breadth-first; they claim that several different analyses are activated in parallel (generally being projected from the head of a phrase), and that all available types of information are used to choose among them. Experimental data have thus far not been able to settle the debate between theorists favoring depth-first models and theorists favoring breadth-first ones. Garden-path theories explain garden-paths elegantly, and provide a more adequate account of how sentence structures are created than lexicalist theories do. On the other hand, they are forced to ascribe the effects of lexical structure, frequency, and context described above to a ‘reanalysis’ stage, which follows the discovery that the initially preferred analysis is incorrect. Lexicalist theories, in contrast, provide a natural account of effects of lexical structure, frequency, and context, especially when implemented as connectionist constraint-satisfaction models. But they have not yet been developed in a way that adequately explains where sentence structures come from (it is unappealing to say that large parts of the grammar are stored with each lexical entry) or that explains why structurally complex sentences are overwhelmingly the ones that lead to garden-paths.
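The depth-first vs. breadth-first contrast can be caricatured in a few lines of code. The sketch below is a toy illustration of my own, not an implementation of any published model: the serial parser commits to the structurally simpler analysis and pays an explicit reanalysis step when it is disconfirmed, while the parallel parser keeps both analyses active and simply reweights them as constraints (frequency, plausibility, context) arrive. All analyses, weights, and the simplicity ranking are invented for illustration.

    # Toy contrast between serial ("garden-path") and parallel ("lexicalist",
    # constraint-satisfaction) treatment of an ambiguity like "The horse raced...".

    def serial_parse(disambiguates_to: str) -> tuple[str, int]:
        analysis = "main verb"           # structurally simplest analysis adopted first
        steps = 1
        if disambiguates_to != analysis:
            steps += 1                    # garden path: a costly reanalysis step
            analysis = disambiguates_to
        return analysis, steps

    def parallel_parse(weights: dict[str, float]) -> dict[str, float]:
        # All analyses stay active; difficulty shows up as competition between
        # closely weighted analyses, not as a discrete reanalysis step.
        total = sum(weights.values())
        return {analysis: w / total for analysis, w in weights.items()}

    # "The horse raced past the barn fell": the input forces the relative clause.
    print(serial_parse("relative clause"))               # ('relative clause', 2)
    # Frequency/plausibility constraints favoring the main-verb reading:
    print(parallel_parse({"main verb": 0.7, "relative clause": 0.3}))

On the serial story, difficulty is located in the reanalysis step; on the parallel story, it is located in the competition between closely weighted analyses. This is, schematically, how the two classes of theory redescribe the same garden-path data.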
5. Beyond Parsing
This article has focused on how readers and listeners use their knowledge of language to compose sentence meanings from words. While much of the power of language comes from readers' and listeners' abilities to ‘follow the rules’ and arrive at grammatically licensed interpretations of sentences, it is also clear that people have the ability to go beyond such literal interpretation. The context in which an utterance is heard or a sentence is read can influence its interpretation far more profoundly than the context effects discussed above. The meaning of words can be altered depending on the context in which they occur (consider ‘bank’ in the context of a trip to the river to fish vs. a purchase of a house, or ‘red’ in the context of a fire truck vs. a red-haired woman). The reference of a definite noun phrase can depend on the discourse context. A listener can determine whether a speaker who says ‘Can you open this door?’ is asking about the adequacy of a job of carpentry vs. requesting some assistance. Speakers can rely on listeners to get their drift when they use figurative language such as metaphors and possibly irony (‘My copy editor is a butcher’). And very generally, what a listener or reader takes a speaker or writer to mean may depend on their shared goals and mutual knowledge (cf. Clark 1996). Nonetheless, all these varied uses of language depend on the listener/reader's ability to put words
together into sentence meanings, which makes the study of the topics considered in this article an important branch of cognitive psychology.
See also: Eye Movements in Reading; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Syntactic Aspects of Language, Neural Basis of; Syntax; Syntax–Semantics Interface; Text Comprehension: Models in Psychology
Bibliography
Altmann G, Steedman M 1988 Interaction with context during human sentence processing. Cognition 30: 191–238
Bever T G 1970 The cognitive basis for linguistic structures. In: Hayes J R (ed.) Cognition and the Development of Language. Wiley, New York, pp. 279–352
Chomsky N 1957 Syntactic Structures. Mouton, The Hague, The Netherlands
Chomsky N 1986 Knowledge of Language: Its Nature, Origin, and Use. Praeger, New York
Clark H H 1996 Using Language. Cambridge University Press, Cambridge, UK
Clifton Jr C 2000 Evaluating models of sentence processing. In: Crocker M, Pickering M, Clifton Jr C (eds.) Architecture and Mechanisms of Language Processing. Cambridge University Press, Cambridge, UK, pp. 31–55
Clifton Jr C, Frazier L 1989 Comprehending sentences with long-distance dependencies. In: Carlson G N, Tanenhaus M K (eds.) Linguistic Structure in Language Processing. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 273–318
Cutler A, Dahan D, Van Donselaar W 1997 Prosody in the comprehension of spoken language: A literature review. Language and Speech 40: 141–201
Fodor J A 1983 Modularity of Mind. MIT Press, Cambridge, MA
Fodor J A, Bever T G, Garrett M F 1974 The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar. McGraw-Hill, New York
Fodor J D 1978 Parsing strategies and constraints on transformations. Linguistic Inquiry 9: 427–74
Frazier L 1987 Sentence processing: A tutorial review. Attention and Performance 12: 559–86
Frazier L 1995 Representational issues in psycholinguistics. In: Miller J L, Eimas P D (eds.) Handbook of Perception and Cognition, Vol. 11: Speech, Language, and Communication. Academic Press, San Diego
Frazier L, Clifton Jr C 1996 Construal. MIT Press, Cambridge, MA
Frazier L, Rayner K 1990 Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29: 181–200
Haberlandt K F 1994 Methods in reading research. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 1–31
Just M A, Carpenter P A 1980 A theory of reading: From eye fixations to comprehension. Psychological Review 87: 329–54
Just M A, Carpenter P A, Woolley J D 1982 Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General 111: 228–38
Kjelgaard M M, Speer S R 1999 Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory and Language 40: 153–94
Kutas M, Van Petten C K 1994 Psycholinguistics electrified: Event-related brain potential investigations. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 83–143
MacDonald M C, Pearlmutter N J, Seidenberg M S 1994 Lexical nature of syntactic ambiguity resolution. Psychological Review 101: 676–703
Marslen-Wilson W D 1973 Linguistic structure and speech shadowing at very short latencies. Nature 244: 522–3
McKoon G, Ratcliff R, Ward G 1994 Testing theories of language processing: An empirical investigation of the on-line lexical decision task. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 1219–28
Miller G A 1962 Some psychological studies of grammar. American Psychologist 17: 748–62
Mitchell D C 1994 Sentence parsing. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 375–410
Rayner K, Sereno S, Morris R, Schmauder R, Clifton Jr C 1989 Eye movements and on-line language comprehension processes. Language and Cognitive Processes 4: SI 21–50
Schafer A, Carter J, Clifton Jr C, Frazier L 1996 Focus in relative clause construal. Language and Cognitive Processes 11: 135–63
Spivey-Knowlton M, Sedivy J C 1995 Resolving attachment ambiguities with multiple constraints. Cognition 55: 227–67
Tanenhaus M K, Trueswell J C 1995 Sentence comprehension. In: Miller J, Eimas P (eds.) Handbook of Perception and Cognition: Speech, Language, and Communication. Academic Press, San Diego, CA, Vol. 11, pp. 217–62
C. Clifton, Jr.
Sequential Decision Making

Sequential decision making describes a situation where the decision maker (DM) makes successive observations of a process before a final decision is made, in contrast to dynamic decision making (see Dynamic Decision Making), which is more concerned with controlling a process over time. Formally, a sequential decision problem is defined such that the DM can take observations X1, X2, … one at a time. After each observation Xn the DM can decide to terminate the process and make a final decision from a set of decisions D, or continue the process and take the next observation X_{n+1}. If the observations X1, X2, … form a random sample, the procedure is called sequential sampling.
In most sequential decision problems there is an implicit or explicit cost associated with each observation. The procedure to decide when to stop taking observations and when to continue is called the stopping rule. The objective in sequential decision making is to find a stopping rule that optimizes the
decision in terms of minimizing losses or maximizing gains, including observation costs. The optimal stopping rule is also called the optimal strategy or the optimal policy. A wide variety of sequential decision problems have been discussed in the statistics literature, including search problems, inventory problems, gambling problems, and secretary-type problems, including sampling with and without recall. Several methods have been proposed to solve the optimization problem under specified conditions, including dynamic programming, Markov chains, and Bayesian analysis. In the psychological literature, sequential decision problems are better known as optional stopping problems. One line of research using sequential decision making is concerned with seeking information in situations such as buying houses, searching for a job candidate, price searching, or target search. The DM continues taking observations until a decision criterion for acceptance is reached. Another line of research applies sequential decision making to account for information processing in binary choice tasks (see Diffusion and Random Walk Processes; Stochastic Dynamic Models (Choice, Response, and Time)), and hypothesis testing such as in signal detection tasks (see Signal Detection Theory: Multidimensional). The DM continues taking observations until either of two decision criteria is reached. Depending on the particular research area, observations are also called offers, options, items, applicants, information, and the like. Observation costs may include not only money but also time, effort, aggravation, discomfort, and so on. Contrary to the objective of statisticians or economists, psychologists are less interested in determining the optimal stopping rule, and more interested in discussing the variables that affect human decision behavior in sequential decision tasks. Optimal decision strategies are considered as normative models, and their predictions are compared to actual choice behavior.
1. Sequential Decision Making with One Decision Criterion

In sequential decision making with one decision criterion the DM takes costly observations Xn, n = 1, …, of a random process one at a time. After observing Xn = xn the DM has to decide whether to continue sampling observations or to stop. In the former case, the observation X_{n+1} is taken at a cost of c_{n+1}; in the latter case the DM receives a net payoff that consists of the payoff minus the observation costs. The DM’s objective is to find a stopping rule that maximizes the expected net payoff. The optimal stopping rule depends on the specific assumptions made about the situation:
(a) the distribution of X is known, not known, or partly known, (b) the Xi are distributed identically for all i, or have similar distributions but with different parameters, or have different distributions, (c) the number of possible observations, n, is bounded or unbounded, (d) the sampling procedure, e.g., whether it is possible to take the highest value observed so far when stopping (sampling with recall) or only to take the last value when stopping (sampling without recall), and (e) the cost function, cn, is fixed for each observation or is a function of n. Many of these problems have been studied theoretically by mathematicians and experimentally by psychologists. Pioneering experimental work was done in a series of papers by Rapoport and colleagues (1966, 1969, 1970, 1972).
1.1 Unknown Sample Distribution: Secretary-type Problems

Kahan et al. (1967) investigated decision behavior in a sequential search task where the DM had to find the largest of a set of n = 200 numbers, observed one at a time from a deck of cards. The observations were taken in random order without replacement. The DM could only declare the current observation as the largest number (sampling without recall), and could compare the number with the previously presented numbers. No explicit cost was charged for each observation, i.e., c = 0. The sample distribution was unknown to the DM. A reward was paid only when the card with the highest number was selected, and nothing otherwise. This describes a decision situation that is known as the secretary problem (a job candidate search problem; for various other names, see, e.g., Freeman 1983) which, in its simplest form, makes explicit the following assumptions (Ferguson 1989): (a) only one position is available, (b) the number n of applicants is known, (c) applicants are interviewed sequentially in random order, each order being equally likely, and (d) all applicants can be ranked without ties—the decision to reject or accept an applicant must be based only on the relative ranks of the applicants interviewed so far, (e) an applicant once rejected cannot later be recalled, and (f) the payoff is 1 when choosing the best of the n applicants, 0 otherwise. The optimal strategy for this kind of problem is to reject the first s − 1, s ≥ 1, items (cards, applicants, draws) and then choose the first item that is best in the relative ranking so far. With
a_s = 1/s + 1/(s + 1) + … + 1/(n − 1)    (1)
the optimal strategy is to stop if a_s ≤ 1 and to continue if a_s > 1, which can easily be determined for small n. For large n, the probability of choosing the best item is
approximated by 1/e and the optimal s by n/e (e = 2.71…). (For derivations, see, e.g., DeGroot 1970, Freeman 1983, Gilbert and Mosteller 1966.) Kahan et al. (1967) reported that about 40 percent of their subjects did not follow the optimal strategy but stopped too late and rejected a card that should have been accepted. The failure of the strategy for describing behavior was assigned to its inadequacy for the described task. Although at the beginning of the experiment the participants did not know anything about the distribution, they could learn about it by taking observations (partial information). To guarantee ignorance of the distribution, Gilbert and Mosteller (1966) recommended supplying only the rank of the observations made so far and not their actual values. Seale and Rapoport (1997) conducted an experiment following this advice. They found that participants (with n = 40 and n = 80) stopped earlier than prescribed by the optimal stopping rule. They proposed simple decision rules or heuristics to describe the actual choice behavior. Using a cutoff rule, the DM rejects the first s − 1 applicants and then chooses the next top-ranked applicant, i.e., the candidate. The DM simply counts the number of applicants and then stops on the first candidate after observing s − 1 applicants. Under a candidate count rule, the DM counts the number of candidates and chooses the jth candidate. A successive non-candidate rule requires the DM to choose the first candidate after observing at least k consecutive noncandidates following the last candidate. The secretary problem has been extended and generalized in many different directions within the mathematical statistics field. Each of the above assumptions has been relaxed in one way or another (Ferguson 1989, Freeman 1983). However, the label of secretary problem tends to be used only when the distribution is unknown and the decision to stop or to continue depends only on the relative ranking of the observations taken so far and not on their actual values.
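As a concrete illustration, the sketch below computes the optimal cutoff s from Eqn. (1) and checks by simulation that the cutoff rule selects the best item with probability close to 1/e. It is a minimal sketch, not drawn from the studies cited here; the function names and the choice n = 80 (one of Seale and Rapoport’s conditions) are ours.

```python
import random

def optimal_cutoff(n):
    # Smallest s with a_s = 1/s + 1/(s+1) + ... + 1/(n-1) <= 1 (Eqn. (1)).
    s, tail = n, 0.0
    for j in range(n - 1, 0, -1):
        tail += 1.0 / j
        if tail > 1.0:
            break
        s = j
    return s

def cutoff_rule_wins(n, s):
    # One play: reject the first s-1 applicants, then take the first one
    # that is best so far; report whether it is the overall best.
    ranks = random.sample(range(n), n)          # random arrival order
    best_seen = max(ranks[:s - 1], default=-1)
    for i in range(s - 1, n):
        if ranks[i] > best_seen:
            return ranks[i] == n - 1
    return ranks[-1] == n - 1                   # forced to take the last one

n = 80
s = optimal_cutoff(n)                           # roughly n/e
trials = 100_000
wins = sum(cutoff_rule_wins(n, s) for _ in range(trials))
print(f"n = {n}, cutoff s = {s}, P(choose best) ~ {wins / trials:.3f} (1/e ~ 0.368)")
```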
1.2 Known Sample Distribution

Rapoport and Tversky (1966, 1970) investigated choice behavior when the mean and the variance of the distribution were known to the DM. The cost for each observation was fixed but the amount varied across experimental conditions, and the number of possible observations n was unbounded (1966) or bounded and known (1970). Behavior for sampling with and without recall was compared. When sampling is without recall, only the value of the last observation, Xn = xn, can be received, and the payoff is this value minus the total sampling cost, i.e., xn − cn. The optimal strategy is to find a stopping rule that maximizes the expected payoff E(XN − cN). When sampling is with recall, the highest value observed so far can be selected and the
payoff is max(x1, …, xn) − cn, and the optimal strategy is to find a stopping rule that maximizes E(max(X1, …, XN) − cN). In the following, v with subscripts and v* denote the expected gain from an (optimal) procedure.
1.2.1 Number of Observations Unbounded. If n is unbounded, i.e., if the number of observations that can be taken is unlimited, and X1, X2, … are sampled from a known distribution function F(x), the optimal strategy is the same for both sampling with and without recall. In particular, the optimal strategy is to continue to take observations whenever the observed value xj < v*, and to stop taking observations as soon as an observed value xj ≥ v*, where v* is the unique solution of

∫_{v*}^{∞} (x − v*) dF(x) = c.    (2)
When the observations are taken from a standard normal distribution with density function φ(x) and distribution function Φ(x), we have that

v* = (φ(v*) − c) / (1 − Φ(v*)).    (3)
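Eqn. (2) has no closed form in general, but its left-hand side is strictly decreasing in v*, so it can be solved numerically. The following sketch is our own illustration (the function name reservation_value and the cost values are ours): it finds v* for the standard normal case by bisection; rearranging the solved equation recovers the fixed-point form (3).

```python
import math

def phi(x):   # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):   # standard normal distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def reservation_value(c, lo=-10.0, hi=10.0, tol=1e-10):
    # Solve phi(v) - v(1 - Phi(v)) = c for v, which is Eqn. (2) for the
    # standard normal; the left-hand side decreases strictly in v.
    def excess(v):
        return phi(v) - v * (1.0 - Phi(v)) - c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid          # expected excess still above c: raise the cutoff
        else:
            hi = mid
    return 0.5 * (lo + hi)

for c in (0.5, 0.1, 0.01):
    print(f"c = {c:4.2f}  ->  v* = {reservation_value(c):.4f}")
```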
Although sampling with and without recall have the same solution, they seem to be different from a psychological point of view. Rapoport and Tversky (1966) found that the group sampling without recall took significantly fewer observations than the participants sampling with recall. The mean number of observations for both groups decreased with increasing cost c, and the difference with respect to the number of observations taken was diminished. However, the participants in both groups took fewer observations than prescribed by the optimal strategy. This nonoptimal behavior of the participants was attributed to a lack of thorough knowledge of the distributions.

1.2.2 Number of Observations Bounded. If n, n ≥ 2, is bounded, i.e., if not more than n observations can be taken, the optimal stopping rules for sampling with and without recall are different. For sampling without recall, an optimal procedure is to continue taking observations whenever xj < v_{n−j} − c and to stop as soon as xj ≥ v_{n−j} − c, where j = 1, 2, …, n indexes the observations, n − j indicates the number of observations which remain available, and
v_{j+1} = (v_j − c) + ∫_{v_j−c}^{∞} (x − (v_j − c)) dF(x).    (4)
With v_1 = E(X) − c, the sequence can be computed successively. Again, assuming a standard normal distribution, v_{j+1} = φ(v_j − c) + (v_j − c)Φ(v_j − c). For sampling with recall, the optimal strategy is to continue the process whenever a value xj < v* and to stop taking observations as soon as an observed value xj ≥ v*, where v* is as in Eqn. (2), which is the same solution as for n unbounded. (For derivations of the strategies, see DeGroot 1970, Sakaguchi 1961.) Rapoport and Tversky (1970) investigated choice behavior within this scenario. Sampling was done both with and without recall. The number of observations that could be taken as well as observation cost varied across experimental groups. One third of the participants did not follow the optimal strategy. Under both sampling procedures and all cost conditions, they took on average fewer observations than predicted by the corresponding optimal stopping rules. There were no systematic differences due to cost, as observed in their previous study. They concluded that ‘the optimal model provides a reasonable good account of the behavior of the subjects’ (p. 119).
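The recursion (4) is easy to evaluate numerically. The sketch below is our own illustration (the horizon n = 10 and cost c = 0.1 are arbitrary choices, not values from the studies above); it computes the cutoff sequence for the standard normal case and prints the stopping thresholds for sampling without recall.

```python
import math

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cutoff_sequence(n, c):
    # Standard normal case of Eqn. (4):
    #   v[j+1] = phi(v[j] - c) + (v[j] - c) * Phi(v[j] - c),  v[1] = -c,
    # where v[j] is the expected net gain with j observations remaining.
    v = [None, -c]                     # v[0] unused; v[1] = E(X) - c = -c
    for j in range(1, n):
        a = v[j] - c
        v.append(phi(a) + a * Phi(a))
    return v

n, c = 10, 0.1
v = cutoff_sequence(n, c)
for m in range(n - 1, 0, -1):          # m observations still available
    print(f"{m:2d} remaining: stop if the current value x >= {v[m] - c:.3f}")
```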
1.3 Different Sample Distributions for Each Observation

Most research concerned with sequential decision making assumes that the observations are sampled from the same distribution, i.e., the Xi are distributed identically for all i. For many decision situations, however, the observations may be sampled from the same distribution family with different parameters, or from different distributions. Especially in economic areas, such as price search, it is reasonable to assume that the distributions from which observations are taken change over time. The sequence of those samples has been called a nonstationary series. Of particular interest are two special nonstationary series: ascending and descending series. For ascending series, the observations are drawn from distributions, usually normal distributions, with increasing mean as i increases; for descending series the mean of the distribution decreases as i increases, i indicating the sample index. For both cases, experiments have been conducted to investigate choice behavior in a changing environment. Shapira and Venezia (1981) compared choice behavior for ascending, descending, and constant (identically distributed Xi) series. In one experiment (numbers from a deck of cards), the distributions were known to the DM; no explicit observation costs were imposed; sampling occurred without recall; and the number of observations that could be taken was limited to n = 7. The variance of the distributions varied across experimental groups. An optimal procedure was assumed to continue taking observations whenever xj < v_{n−j}, and to stop as soon as xj ≥ v_{n−j}, where j = 1, 2, …, n indexes the observations, n − j indicates the number of
observations which remain available, and k = 1, …, n indicates the specific distribution for the jth observation. Thus

v_{j+1} = v_j + ∫_{v_j}^{∞} (x − v_j) dF_k(x).    (5)
With v_1 = E(X_1), the sequence can be computed successively. Assuming standard normal distributions, v_{j+1} = φ_k(v_j) + v_j Φ_k(v_j). Across all conditions, 58 percent of the participants behaved in an optimal way. The proportion of optimal stopping did not depend on the type of series but on the size of the variance. Nonoptimal stopping (24 percent stopped too early; 18 percent too late) depended on the series and on the size of the variance. In particular, participants stopped too early on ascending and too late on descending series. A similar result was observed by Brickman (1972). In this study, departing from the optimal stopping rule was attributed to an inadequacy of the stopping rule taken for the particular experimental conditions (assuming complete knowledge of the distributions). In a secretary problem design (see Sect. 1.1), Corbin et al. (1975) were less concerned with optimal choice behavior than with the processes by which the participants made their selections, and with factors that influenced those processes. The emphasis of the investigation was on decision making heuristics rather than on the adequacy of optimal models. With the same optimal stopping rule for all experimental conditions, they found that stopping behavior depended on contextual variables such as the ascending or descending trend of the inspected numbers in the stack.
2. Search Problems—Multiple Information Sources In a sequential decision making task with multiple information sources, the DM has the option to take information sequentially from different sources. Each information source may provide valid information with a particular probability and at different cost. The task is not only to decide to stop or to continue the process but also, if continuing, which source of information to consult. Early experimental studies were done by Kanarick et al. (1969), Rapoport (1969), Rapoport et al. (1972). A typical task is to find an object (e.g., a black ball) which is hidden in one of several possible locations (e.g., in one of several bins containing white balls). The optimal search strategy depends on further task specifications, such as whether the object can move from one location to another, how many objects are to be found, and whether the search process may stop before the object has been found. Rapoport (1969)
investigated the case when a single object that could not move was to be found in one of r, r ≥ 2, possible locations. The DM was not allowed to stop the process before the target was found. All of the following were known to the DM: the a priori probability pi, pi > 0, that the object is in location i, i = 1, 2, …, r, with Σi pi = 1; a miss probability αi, 0 < αi < 1, that even if the object is in location i it will not be found in a particular search of that location (1 − αi is referred to as the respective detection probability); and a cost, ci, for a single observation at location i. The objective of the DM is to find a search strategy that minimizes the expected cost. For i = 1, …, r and j = 1, 2, … let Πij denote the probability that the object is found at location i during the jth search and the search is terminated. Then

Πij = pi αi^(j−1) (1 − αi),   i = 1, …, r, j = 1, 2, …    (6)
If all values of Πij/ci for all values of i and j are arranged in order of decreasing magnitude, the optimal strategy is to search according to this ordering (for derivations, see DeGroot 1970). Ties may be ordered arbitrarily among themselves. The optimal strategy is determined by the detection probabilities and observation costs, and optimal search behavior implies a balance between maximizing the detection probability and minimizing the observation cost. Rapoport (1969) found that participants did not behave optimally. They were more concerned with maximizing the probability of detecting the target than with minimizing observation cost. When the difference in observation cost ci among the i = 1, 2, 3, 4 locations was increased, the deviation from the optimal strategy increased further. Rapoport et al. (1972) varied the search problems by allowing the DM to terminate the search at any time, adding a terminal reward, R, for finding the target, and a terminal penalty, B, for not finding the target. Most participants showed a bias toward maximizing detection probability vs. minimizing search cost per observation, similar to the previous study.
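The ordering of Πij/ci can be generated lazily: within a location the ratios decrease in j, so it suffices to keep one ‘next visit’ per location in a priority queue. The sketch below is our own illustration of this rule; the three locations and their numbers are hypothetical.

```python
import heapq

def search_order(p, alpha, cost, steps):
    # Visit location-search pairs in decreasing order of
    # Pi_ij / c_i = p_i * alpha_i**(j-1) * (1 - alpha_i) / c_i  (Eqn. (6)).
    heap = []
    for i, (pi, ai, ci) in enumerate(zip(p, alpha, cost)):
        heapq.heappush(heap, (-pi * (1 - ai) / ci, i, 1))   # j = 1 term
    order = []
    for _ in range(steps):
        _, i, j = heapq.heappop(heap)
        order.append((i + 1, j))        # jth look at location i (1-based)
        nxt = p[i] * alpha[i] ** j * (1 - alpha[i]) / cost[i]
        heapq.heappush(heap, (-nxt, i, j + 1))
    return order

p     = [0.5, 0.3, 0.2]     # prior probabilities (hypothetical)
alpha = [0.6, 0.2, 0.1]     # miss probabilities (hypothetical)
cost  = [1.0, 1.0, 2.0]     # cost per look (hypothetical)
print(search_order(p, alpha, cost, steps=8))
```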
3. Sequential Decision Making with Two or More Possible Decisions

A random sample X1, X2, … is generated by an unknown state of nature, Θ. The DM can take observations one at a time. After observing Xn = xn the DM makes inferences about Θ based on the values of X1, …, Xn and can decide whether to continue sampling observations or to stop the process. In the former case, observation X_{n+1} is taken; in the latter, the DM makes a final decision d ∈ D. The consequences to the DM depend on the decision d and the value θ.
The statistical theory for this situation was developed by Wald during the 1940s. It has been used to test hypotheses and estimate parameters. In psychological research, sequential decision making of this kind is usually limited to two decisions, D = {d1, d2}, and applied to binary choice tasks (see Diffusion and Random Walk Processes; Stochastic Dynamic Models (Choice, Response, and Time); Bayesian Theory: History of Applications). The standard theory of sequential analysis by Wald (1947) does not include considerations of observation costs C(n), losses for terminal decisions L(θ, d), and a priori (subjective) probabilities π of the alternative states of nature. Deferred decision theory generalizes the original theory by including these variables explicitly. The objective of the DM is to find a stopping rule that minimizes expected loss (called risk) and expected observation cost. The form of that optimal stopping rule depends mainly on the assumptions about the number of observations that can be taken (bounded or unbounded), and on the assumption of cost per observation (fixed or not) (see DeGroot 1970). Birdsall and Roberts (1965), Edwards (1965), and Rapoport and Burkheimer (1971) introduced the idea of deferred decision theory as normative models of choice behavior to the psychological community. Experiments investigating human behavior in deferred decision tasks have been carried out by Pitz and colleagues (e.g., Pitz et al. 1969), and by Busemeyer and Rapoport (1988). Rapoport and Wallsten (1972) summarize experimental findings. For illustration, assume the decision problem in its simplest form. Suppose two possible states of nature, θ1 or θ2, and two possible decisions, d1 and d2. Cost c per observation is fixed and the number of observations is unbounded. The DM does not know which of the states of nature, θ1 or θ2, is generating the observations, but there are a priori probabilities π that it is θ1 and (1 − π) that it is θ2. Let wi denote the loss for a terminal decision incurred by the DM in deciding that θi is not the correct state of nature when it actually is (i = 1, 2). No losses are assumed when the DM makes a correct decision. Let πn denote the posterior probability that θ1 is the correct state of nature generating the observations after n observations have been made. The total posterior expected loss is rn = min{w1 πn, w2(1 − πn)} + nc. The DM’s objective is to minimize the expected loss. An optimal stopping rule is specified in terms of decision boundaries, α and β. If the posterior probability is greater than or equal to α, then decision d1 is made; if the posterior probability is smaller than or equal to β, then d2 is selected; otherwise sampling continues.
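The sketch below illustrates this boundary rule; it is our own illustration, and the Bernoulli likelihoods, the boundaries α = 0.95 and β = 0.05, and the data are hypothetical. The posterior πn is updated by Bayes’s theorem after each observation and compared with the two boundaries.

```python
def deferred_decision(prior, f1, f2, xs, alpha, beta, cost=0.0):
    # f1, f2: likelihoods of an observation under theta_1 and theta_2.
    # Update pi_n = P(theta_1 | data); stop with d1 at alpha, d2 at beta.
    pi, spent = prior, 0.0
    for n, x in enumerate(xs, start=1):
        spent += cost
        num = pi * f1(x)
        pi = num / (num + (1.0 - pi) * f2(x))   # Bayes's theorem
        if pi >= alpha:
            return "d1", n, pi, spent
        if pi <= beta:
            return "d2", n, pi, spent
    return "no decision yet", len(xs), pi, spent

# theta_1: P(success) = 0.7; theta_2: P(success) = 0.4 (hypothetical).
f1 = lambda x: 0.7 if x == 1 else 0.3
f2 = lambda x: 0.4 if x == 1 else 0.6
data = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
print(deferred_decision(0.5, f1, f2, data, alpha=0.95, beta=0.05, cost=0.1))
```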
See also: Decision Making (Naturalistic), Psychology of; Decision Making, Psychology of; Decision Research: Behavioral; Dynamic Decision Making; Multi-attribute Decision Making in Urban Studies; Sequential Statistical Methods
Bibliography
Birdsall T G, Roberts R A 1965 Theory of signal detectability: Deferred decision theory. The Journal of the Acoustical Society of America 37: 1064–74
Brickman P 1972 Optional stopping on ascending and descending series. Organizational Behavior and Human Performance 7: 53–62
Busemeyer J R, Rapoport A 1988 Psychological models of deferred decision making. Journal of Mathematical Psychology 32(2): 91–133
Corbin R M, Olson C L, Abbondanza M 1975 Context effects in optional stopping decisions. Organizational Behavior and Human Performance 14: 207–16
DeGroot M H 1970 Optimal Statistical Decisions. McGraw-Hill, New York
Edwards W 1965 Optimal strategies for seeking information: Models for statistics, choice response times, and human information processing. Journal of Mathematical Psychology 2: 312–29
Ferguson T S 1989 Who solved the secretary problem? Statistical Science 4(3): 282–96
Freeman P R 1983 The secretary problem and its extensions: A review. International Statistical Review 51: 189–206
Gilbert J P, Mosteller F 1966 Recognizing the maximum of a sequence. Journal of the American Statistical Association 61: 35–73
Kahan J P, Rapoport A, Jones L E 1967 Decision making in a sequential search task. Perception & Psychophysics 2(8): 374–6
Kanarick A F, Huntington J M, Peterson R C 1969 Multisource information acquisition with optional stopping. Human Factors 11: 379–85
Pitz G F, Reinhold H, Geller E S 1969 Strategies of information seeking in deferred decision making. Organizational Behavior and Human Performance 4: 1–19
Rapoport A 1969 Effects of observation cost on sequential search behavior. Perception & Psychophysics 6(4): 234–40
Rapoport A, Tversky A 1966 Cost and accessibility of offers as determinants of optional stopping. Psychonomic Science 4: 45–6
Rapoport A, Tversky A 1970 Choice behavior in an optional stopping task. Organizational Behavior and Human Performance 5: 105–20
Rapoport A, Burkheimer G J 1971 Models of deferred decision making. Journal of Mathematical Psychology 8: 508–38
Rapoport A, Lissitz R W, McAllister H A 1972 Search behavior with and without optional stopping. Organizational Behavior and Human Performance 7: 1–17
Rapoport A, Wallsten T S 1972 Individual decision behavior. Annual Review of Psychology 23: 131–76
Sakaguchi M 1961 Dynamic programming of some sequential sampling design. Journal of Mathematical Analysis and Applications 2: 446–66
Seale D A, Rapoport A 1997 Sequential decision making with relative ranks: An experimental investigation of the ‘secretary problem.’ Organizational Behavior and Human Decision Processes 69(3): 221–36
Shapira Z, Venezia I 1981 Optional stopping on nonstationary series. Organizational Behavior and Human Performance 27: 32–49
Wald A 1947 Sequential Analysis. Wiley, New York
A. Diederich

Sequential Statistical Methods
Statistics plays two fundamental roles in empirical research. One is in determining the data collection process: the experimental design. The other is in analyzing the data once it has been collected. For the purposes of this article, two types of experimental designs are distinguished: sequential and nonsequential. In a sequential design the data that accrue in an experiment can affect the future course of the experiment. For example, an observation made on one experimental unit treated in a particular way may determine the treatment used for the next experimental unit. The term ‘adaptive’ is commonly used as an alternative to sequential. In a nonsequential design the investigator can carry out the entire experiment without knowing any of the interim results. The distinction between sequential and nonsequential is murky. An investigator’s ability to carry out an experiment exactly as planned is uncertain, as information that becomes available from within and outside the experiment may lead the investigator to amend the design. In addition, a nonsequential experiment may give results that encourage the investigator to run a second experiment, one that might even simply be a continuation of the first. Considered separately, both experiments are nonsequential, but the larger experiment that consists of the two separate experiments is sequential. In a typical type of nonsequential design, 20 patients suffering from depression are administered a drug and their improvements are assessed. An example sequential variation is the following. Patients’ improvements are recorded ‘in sequence’ during the experiment. The experiment stops should it happen that at least nine of the first 10 patients, or no more than one of the first 10 patients improve(s). On the other hand, if between two and eight of the first 10 patients improve then sampling continues to a second set of 10 patients, making the total sample size equal to 20 in that case. Another type of sequential variation is when the dose of the drug is increased for the second 10 patients should it happen that fewer than four of the first 10 improve. Much more complicated sequential designs are possible. For example, the first patient may be assigned a dose in the middle of a range of possible doses. If the patient improves then the next patient is assigned the next lower dose, and if the first patient does not improve then the next patient is assigned the next higher dose. This process continues, always dropping the dosage if the immediately preceding patient improved, and increasing the dosage if the immediately preceding patient did not improve. This is called an ‘up-and-down’ design. Procedures in which batches of experimental units (such as groups of 10 patients each) are analyzed before proceeding to the next stage of the experiment
are called ‘group-sequential.’ Designs such as the up-and-down design, in which the course of the experiment can change after each experimental unit responds, are called ‘fully sequential.’ So a fully sequential design is a group-sequential design in which the group size is one. Designs in which the decision of when to stop the experiment depends on the accumulating results are called ‘sequential stopping.’ Using rules to determine which treatments to assign to the next experimental unit or batch of units is called ‘sequential allocation.’ Designs of most scientific experiments are sequential, although perhaps not formally so. Investigators usually want to conserve time and resources. In particular, they do not want to continue an experiment if they have already learned what they set out to learn, and this is so whether their conclusion is positive or negative, or if finding a conclusive answer would be prohibitively expensive. (An experiment in which the investigator discovers that the standard deviation of the observations is much larger than originally thought is an example of one that would be prohibitively expensive to continue, because the required sample size would be large.) Sequential designs are difficult or impossible to use in some investigations. For example, results might take a long time to obtain, and waiting for them would mean delaying other aspects of the experiment. Suppose one is interested in whether grade-schoolers diagnosed with attention deficit hyperactivity disorder (ADHD) should be prescribed Ritalin. The outcome of interest is whether children on Ritalin will be addicted to drugs as adults. Consider assigning groups of 10 children to Ritalin and 10 to a placebo, and waiting to observe their outcomes before deciding whether to assign an additional group of 10 patients to each treatment. The delay in observation means that it would probably take hundreds of years to get an answer to the overall question. The long-term nature of the endpoint means that any reasonable experiment addressing this question would necessarily be nonsequential, with large numbers of children assigned to the two groups before any information at all would become available about the endpoint.
1. Analyzing Data from Sequential Experiments—Frequentist Case

Consider an experiment of a particular type, say one to assess extrasensory perception (ESP) ability. A subject claiming to have ESP is asked to choose between two colors. The null hypothesis of no ability is that the subject is only guessing, in which case the correct color has a probability of 1/2. Suppose the subject gets 13 correct out of 17 tries. How should these results be analyzed and reported? The answer depends on one’s statistical philosophy. Frequentists and Bayesians
take different approaches. Frequentist analyses depend on whether the experiment’s design is sequential, and if it is sequential the conclusions will differ depending on the actual design used. In the nonsequential case the subject is given exactly 17 tries. The frequentist P-value is the probability of results as extreme as or more extreme than those observed. The results are said to be ‘statistically significant’ if the P-value is less than 5 percent. A convention is to include both 13 or more successes and 13 or more failures. (This ‘two-sided’ case allows for the possibility that the subject has ESP but has inverted the ‘extrasensory’ signals.) Assuming the null hypothesis and that the tries are independent, the probabilities of the number of successes are binomial. Binomial probabilities can be approximated using the normal distribution. The z-score for 13 out of 17 is about 2, and so the probability of 13 or more successes or 13 or more failures is about 0.05 (the exact binomial probability is 0.049), and so the results are statistically significant at the 5 percent level and the null hypothesis is rejected. Now suppose the experiment is sequential. The frequentist significance level is now different, and it depends on the actual design used. Suppose the design is to sample until the subject gets at least four successes and at least four failures—same data, different design. Again, more extreme means 13 or more successes (and exactly four failures) or 13 or more failures (and exactly four successes). The total probability of these extreme values is 0.021—less than 0.049—and so the results are now more highly significant than if the experiment’s design had been nonsequential. Consider another sequential design, one of a type of group-sequential designs commonly used in clinical trials. The experimental plan is to stop at 17 tries if 13 or more are successes or 13 or more are failures, and hence the experiment is stopped on target. But if after 17 tries the number of successes is between five and 12 then the experiment continues to a total of 44 tries. If at that time, 29 or more are successes or 29 or more are failures then the null hypothesis is rejected. To set the context, suppose the experiment is nonsequential, with sample size fixed at 44 and no possibility of stopping at 17; then the exact significance level is again 0.049. When using a sequential design, one must consider all possible ways of rejecting the null hypothesis in calculating a significance level. In the group-sequential design there are more ways to reject than in the nonsequential design with the sample size fixed at 17 (or fixed at 44). The overall probability of rejecting is greater than 0.049 but is somewhat less than 0.049 + 0.049 because some sample paths that reject the null hypothesis at sample size 17 also reject it at sample size 44. The total probability of rejecting the null hypothesis for this design is actually 0.080. Therefore, even though the results beyond the first 17 observations are never observed, the fact that they might have been observed makes 13 successes out of 17 no longer statistically significant (since 0.08 is greater than 0.05).
Table 1 Summary of experimental designs and related significance levels

Stopping rule                                                Significance level
After 17 observations (nonsequential)                        0.049
After at least 4 successes and 4 failures                    0.021
After 17 or 44 observations, depending on interim results    0.08
Stop when ‘you think you know the answer’                    Undefined
The three designs above are summarized in Table 1. The table includes a fourth design in which the significance level cannot be found. To preserve a 0.05 significance level in group-sequential or fully sequential designs, investigators must adopt more stringent requirements for stopping and rejecting the null hypothesis; that is, they must include fewer observations in the region where the null hypothesis is rejected. For example, the investigator in the above study might drop 13 successes or failures in 17 tries and 29 successes or failures in 44 tries from the rejection region. The investigator would stop and claim significance only if there are at least 14 successes or at least 14 failures in the first 17 tries, and claim significance after 44 tries only if there are at least 30 successes or at least 30 failures. The nominal significance levels (those appropriate had the experiment been nonsequential) at n = 17 and n = 44 are 0.013 and 0.027, and the overall (or adjusted) significance level of rejecting the null hypothesis is 0.032. (No symmetric rejection regions containing more observations allow the significance level to be greater than this but still smaller than 0.05.) With this design, 13 successes out of 17 is not statistically significant (as indicated above) because this data point is not in the rejection region. The above discussion is in the context of significance testing. But the same issues apply in all types of frequentist inferences, including confidence intervals. The implications of the need to modify rejection regions depending on the design of an experiment are profound. In view of the penalties that an investigator pays in significance level that are due to repeated analyses of accumulating data, investigators strive to minimize the number of such analyses. They shy away from using sequential designs and so may miss opportunities to stop or otherwise modify the experiment depending on accumulating results. What happens if investigators fail to reveal that other analyses did occur, or that the experiment might have continued had other results been observed? Any frequentist conclusion that fails to take the other analyses into account is meaningless. Strictly speaking,
this is a breach of scientific ethics when carrying out frequentist analyses. But it is difficult to find fault with investigators who do not understand the subtleties of frequentist reasoning and who fail to make necessary adjustments to their inferences. For more information about the frequentist approach to sequential experimentation, see Whitehead (1992).
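These significance levels can be verified by direct enumeration under the null hypothesis p = 1/2. The sketch below is our own check, not from Whitehead; it computes the overall rejection probability of the two-stage design by summing over first-stage outcomes.

```python
from math import comb

def binom_pmf(n, k, p=0.5):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def overall_alpha(n1, k1, n2, k2):
    # Two-stage design: reject at n1 if successes >= k1 or failures >= k1;
    # otherwise continue to n2 and reject if successes >= k2 or failures >= k2.
    total = 0.0
    for s1 in range(n1 + 1):
        if s1 >= k1 or n1 - s1 >= k1:
            total += binom_pmf(n1, s1)          # rejected at the first look
            continue
        for s2 in range(n2 - n1 + 1):           # the remaining observations
            s = s1 + s2
            if s >= k2 or n2 - s >= k2:
                total += binom_pmf(n1, s1) * binom_pmf(n2 - n1, s2)
    return total

fixed17 = sum(binom_pmf(17, k) for k in range(18) if k >= 13 or k <= 4)
print(f"fixed n = 17:            {fixed17:.3f}")                        # 0.049
print(f"two-stage, 13 then 29:   {overall_alpha(17, 13, 44, 29):.3f}")  # 0.080
print(f"two-stage, 14 then 30:   {overall_alpha(17, 14, 44, 30):.3f}")  # 0.032
```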
2. Analyzing Data from Sequential Experiments—Bayesian Case

When taking a Bayesian approach (see Bayesian Statistics) (or a likelihood approach), conclusions are based only on the observed experimental results and do not depend on the experiment’s design. So the murky distinction that exists between sequential and nonsequential designs is irrelevant in a Bayesian approach. In the example considered above, 13 successes out of 17 tries will give rise to the same inference in each of the designs considered. Bayesian conclusions depend only on the data actually observed and not otherwise on the experimental design (Berger and Wolpert 1984, Berry 1987). The Bayesian paradigm is inherently sequential. Bayes’s theorem prescribes the way learning takes place under uncertainty. It specifies how an observation modifies one’s state of knowledge (Berry 1996). Moreover, each observation that is planned has a probability distribution. After 13 successes in 17 tries, the probability of success on the next try can be found. This requires a distribution, called a ‘prior distribution,’ for the probability of success on the first of the 17 tries. Suppose the prior distribution is uniform from zero to one. (This is symmetric about the null hypothesis of 1/2, but it is unlikely to be anyone’s actual prior distribution in the case of ESP because it gives essentially all the probability to some ESP ability.) The predictive probability of a success on the 18th try is then (13 + 1)/(17 + 2) = 0.737, called ‘Laplace’s rule of succession’ (Berry 1996, p. 204). Whether to take this 18th observation can be evaluated by weighing the additional knowledge gained (having 14 successes out of 18, with probability 0.737, or 13 successes out of 18, with probability 0.263) against the costs associated with the observation. Predictive probabilities are fundamental in a Bayesian approach to sequential experimentation. They indicate how likely it is that the various possibilities for future data will happen, given the data currently available. Suppose that after 13 successes of 17 tries one is entertaining taking an additional 27 observations. One may be interested in getting at least 30 successes out of the total of 44 observations—which means at least 17 of the additional 27 observations are successes. The predictive probability of this is about 50 percent. Or one may be interested in getting successes
in at least 1/2 (22) of the 44 tries. The corresponding predictive probability is 99.5 percent. The ability to use Bayes’s theorem for updating one’s state of knowledge and the use of predictive probabilities make the Bayesian approach appealing to researchers in the sequential design of experiments. As a consequence, many researchers who prefer a frequentist perspective use the Bayesian approach in the context of sequential experimentation. If they are interested in finding the frequentist operating characteristics (such as significance level and power), these can be calculated by simulation. The next section (Sect. 3) considers a special type of sequential experiment. The goals of the section are to describe some of the calculational issues that arise in solving sequential problems and to convey some of the interesting aspects of sequential problems. It takes a Bayesian perspective.
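Under the uniform prior, such predictive probabilities come from the beta-binomial distribution: after 13 successes and 4 failures the posterior is Beta(14, 5). The sketch below is our own illustration of the machinery (function names are ours); it reproduces Laplace’s rule and prints exact beta-binomial tails for the two events just discussed. The tails follow mechanically from the Beta(14, 5) posterior and depend entirely on the prior assumed.

```python
from math import comb, lgamma, exp

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(k, m, a, b):
    # P(k successes in m future tries) when the success probability
    # has a Beta(a, b) posterior: the beta-binomial predictive pmf.
    return comb(m, k) * exp(log_beta(a + k, b + m - k) - log_beta(a, b))

a, b = 14, 5                 # posterior after 13 successes, 4 failures
print(f"P(success on 18th try) = {a / (a + b):.3f}")   # 0.737, Laplace's rule

m = 27                       # 27 additional observations
tail_30 = sum(beta_binom_pmf(k, m, a, b) for k in range(17, m + 1))
tail_22 = sum(beta_binom_pmf(k, m, a, b) for k in range(9, m + 1))
print(f"P(at least 30 successes in all 44) = {tail_30:.3f}")
print(f"P(at least 22 successes in all 44) = {tail_22:.3f}")
```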
3. Sequential Allocation of Experiments: Bandit Problems

In many types of experiments, including many clinical trials, experimental units are randomized in a balanced fashion to the candidate treatments. The advantage of a balanced design is that it gives maximal information about the differences between treatments. In some types of experiment, including some clinical trials, it may be important to obtain good results on the units that are part of the experiment. Treatments—or arms—are assigned based on accumulating results; that is, assignment is sequential. The goal is to maximize the overall effectiveness for the units in the experiment, but also, perhaps, for units not actually in the experiment whose treatment might benefit from information gained in the experiment. Specifying a design is difficult. The first matter to be considered is the arm selected for the initial unit. Suppose that the first observation is X1. The second component of the design is the arm selected next, given X1 and also given the first arm selected. The third component depends on X1 and the second observation X2, and on the corresponding arms selected. And so on. A design is optimal if it maximizes the expected number of successes. An arm is optimal if it is the first selection of an optimal design. Consider temporarily an experiment with n units and two available arms. Outcomes are dichotomous: arm 1 has success probability p1 and arm 2 has success probability p2. The goal is to maximize the expected number of successes among the n units. Arm 1 is standard and has known success proportion p1. Arm 2 has unknown efficacy. Uncertainty about p2 is given in terms of a prior probability distribution. To be specific, suppose that this is uniform on the interval from 0 to 1.
Table 2 Possible designs and associated expected number of successes

Design       Expected number of successes
{1; 1, 1}    2p1
{1; 1, 2}    p1 + p1^2 + (1 − p1)/2
{1; 2, 1}    p1 + p1/2 + (1 − p1)p1
{1; 2, 2}    p1 + 1/2
{2; 1, 1}    1/2 + p1
{2; 1, 2}    1/2 + (1/2)p1 + (1/2)(1/3)
{2; 2, 1}    1/2 + (1/2)(2/3) + (1/2)p1
{2; 2, 2}    2(1/2) = 1
If n = 1 then the design requires only an initial selection, arm 1 or arm 2. Choosing arm 1 has expected number of successes p1. Choosing arm 2 has conditional expected number of successes p2, and an unconditional expected number of successes, the prior probability of success, which is 1/2. Therefore arm 1 is optimal if p1 > 1/2 and arm 2 is optimal if p1 < 1/2. (Both arms—and any randomization between them—are optimal when p1 = 1/2.) The problem is more complicated for n ≥ 2. Consider n = 2. There are two initial choices and two choices depending on the result of the first observation. There are eight possible designs. One can write a design as {a; aS, aF}, where a is the initial selection, aS is the next selection should the first observation be a success, and aF is the next selection should the first observation be a failure. To find the expected number of successes for a particular design, one needs to know such quantities as the probability of a success on arm 2 after a success on arm 2 (which is 2/3) and the probability of a success on arm 2 after a failure on arm 2 (which is 1/3). The possible designs and their associated expected numbers of successes are given in Table 2. It is easy to check that only three of these expected numbers of successes (those for {1; 1, 1}, {2; 2, 1}, and {2; 2, 2}) are candidates for the maximum. If p1 ≥ 5/9 then {1; 1, 1} is optimal; if 1/3 ≤ p1 ≤ 5/9 then {2; 2, 1} is optimal; and if p1 ≤ 1/3 then {2; 2, 2} is optimal. For example, if p1 = 1/2 then it is optimal to use the unknown arm 2 initially. If the outcome is a success, then a decision is made to ‘stay with a winner’ and use arm 2 again. If a failure occurs, then the decision is made to switch to the known arm 1. Enumeration of designs is tedious for large n. Most designs can be dropped from consideration based on theoretical results (Berry and Fristedt 1985). For example, there is a breakeven value of p1, say p1*, such that arm 1 is optimal for p1 ≥ p1*. Also, one need consider only those designs that continue to use arm 1 once it has been selected. But many designs remain. Backward induction can be used to find an optimal design (Berry and Fristedt 1985).
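The backward induction is compact enough to sketch. The code below is our own illustration, not Berry and Fristedt’s implementation: it recurses on the Beta(1 + s, 1 + f) posterior for arm 2, using the fact noted above that once arm 1 is selected it is used for all remaining units. With p1 = 1/2 it matches the n = 1 and n = 2 values derived above and approaches 5/8 for large n; a bisection on p1 would similarly locate breakeven values.

```python
from functools import lru_cache

def optimal_successes(n, p1):
    # State (s, f, m): s successes and f failures observed on arm 2,
    # m units remaining; arm 2's success probability has a uniform
    # prior, so its posterior mean is (s + 1) / (s + f + 2).
    @lru_cache(maxsize=None)
    def v(s, f, m):
        if m == 0:
            return 0.0
        p2 = (s + 1) / (s + f + 2)
        use_arm2 = p2 * (1 + v(s + 1, f, m - 1)) + (1 - p2) * v(s, f + 1, m - 1)
        return max(m * p1, use_arm2)    # m * p1: use arm 1 from here on
    return v(0, 0, n)

for n in (1, 2, 5, 10, 20, 50, 100):
    print(f"n = {n:3d}: optimal proportion = {optimal_successes(n, 0.5) / n:.3f}")
```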
Table 3 The optimal expected proportion of successes for selected values of n and for fixed p1 = 1/2

n                          1      2      5      10     20     50     100    200    500    1,000  10,000
Proportion of successes    0.500  0.542  0.570  0.582  0.596  0.607  0.613  0.617  0.621  0.622  0.624

Table 4 The breakeven values of p1* for selected values of n

n      1      2      5      10     20     50     100    200    500    1,000  10,000
p1*    0.500  0.556  0.636  0.698  0.758  0.826  0.869  0.902  0.935  0.954  0.985
Table 3 gives the optimal expected proportion of successes for selected values of n and for fixed p1 = 1/2. Asymptotically, for large n, the maximal expected proportion of successes is 5/8, which is the expected value of the maximum of p1 and p2. Both arms offer the same chance of success on the current unit, but only arm 2 gives information that can help in choosing between the arms for treating later units. Table 4 gives the breakeven values of p1* for selected values of n. This table shows that information is more important for larger n. For example, if p1 = 0.75 then arm 1 would be optimal for n = 10, but it would be advisable to test arm 2 when n = 100; this is so even though arm 1 has probability of 0.75 of being better than arm 2. When there are several arms with unknown characteristics, the problem is still more complicated. Optimal designs may well indicate selection of an arm that was used previously and set aside in favor of another arm because of inadequate performance. For the methods and theory for solving such problems, see Berry (1972), and Berry and Fristedt (1985). The optimal designs are generally difficult to describe. Berry (1978) provides easy-to-use sequential designs that are not optimal but that perform reasonably well. Suppose the n units in the experiment are a subset of the N units on which arms 1 and 2 can be applied. Berry and Eick (1995) consider the case of two arms with dichotomous response and show how to incorporate all N units into the design problem. They find the optimal Bayes design when p1 and p2 have independent uniform prior distributions. They compare this with various other sequential designs and with a particular nonsequential design: balanced randomization to arms 1 and 2. The Bayes design performs best on average, of course, but it is robust in the sense that it outperforms the other designs for essentially all pairs of p1 and p2.
4. Further Reading

The pioneers in sequential statistical methods were Wald (1947) and Barnard (1944). They put forth the sequential probability ratio test (SPRT), which is of
fundamental importance in sequential stopping problems. The study of the SPRT dominated the theory and methodology of sequential experimentation for decades. For further reading about Bayesian vs. frequentist issues in sequential design, see Berger (1986), Berger and Berry (1988), Berger and Wolpert (1984), and Berry (1987, 1993). For further reading about the frequentist perspective, see Chow et al. (1971) and Whitehead (1992). For further reading about Bayesian design issues, see Berry (1993), Berry and Stangl (1996), Chernoff and Ray (1965), Cornfield (1966), and Lindley and Barnett (1965). For further reading about bandit problems, see Berry (1972, 1978), Berry and Eick (1995), Berry and Fristedt (1985), Bradt et al. (1956), Friedman et al. (1964), Murphy (1965), Rapoport (1967), Rothschild (1974), Viscusi (1979), and Whittle (1982/3). There is a journal called Sequential Analysis that is dedicated to the subject of this article.

See also: Clinical Treatment Outcome Research: Control and Comparison Groups; Experimental Design: Overview; Experimental Design: Randomization and Social Experiments; Psychological Treatments: Randomized Controlled Clinical Trials; Quasi-Experimental Designs
Bibliography
Barnard G A 1944 Statistical Methods and Quality Control, Report No. QC/R/7. British Ministry of Supply, London
Berger J O 1986 Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer, New York
Berger J O, Berry D A 1988 Statistical analysis and the illusion of objectivity. American Scientist 76: 159–65
Berger J O, Wolpert R L 1984 The Likelihood Principle. Institute of Mathematical Statistics, Hayward, CA
Berry D A 1972 A Bernoulli two-armed bandit. Annals of Mathematical Statistics 43: 871–97
Berry D A 1978 Modified two-armed bandit strategies for certain clinical trials. Journal of the American Statistical Association 73: 339–45
Berry D A 1987 Interim analysis in clinical trials: The role of the likelihood principle. American Statistician 41: 117–22
Berry D A 1993 A case for Bayesianism in clinical trials (with discussion). Statistics in Medicine 12: 1377–404
Berry D A 1996 Statistics: A Bayesian Perspective. Duxbury Press, Belmont, CA
Berry D A, Eick S G 1995 Adaptive assignment versus balanced randomization in clinical trials: A decision analysis. Statistics in Medicine 14: 231–46
Berry D A, Fristedt B 1985 Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London
Berry D A, Stangl D K 1996 Bayesian methods in health-related research. In: Berry D A, Stangl D K (eds.) Bayesian Biostatistics. Marcel Dekker, New York, pp. 1–66
Bradt R N, Johnson S M, Karlin S 1956 On sequential designs for maximizing the sum of n observations. Annals of Mathematical Statistics 27: 1060–70
Chernoff H, Ray S N 1965 A Bayes sequential sampling inspection plan. Annals of Mathematical Statistics 36: 1387–407
Chow Y S, Robbins H, Siegmund D 1971 Great Expectations. Houghton Mifflin, Boston
Cornfield J 1966 Sequential trials, sequential analysis and the likelihood principle. American Statistician 20: 18–23
Friedman M P, Padilla G, Gelfand H 1964 The learning of choices between bets. Journal of Mathematical Psychology 1: 375–85
Lindley D V, Barnett B N 1965 Sequential sampling: Two decision problems with linear losses for binomial and normal random variables. Biometrika 52: 507–32
Murphy R E Jr. 1965 Adaptive Processes in Economic Systems. Academic Press, New York
Rapoport A 1967 Dynamic programming models for multistage decision making tasks. Journal of Mathematical Psychology 4: 48–71
Rothschild M 1974 A two-armed bandit theory of market pricing. Journal of Economic Theory 9: 185–202
Viscusi W K 1979 Employment Hazards: An Investigation of Market Performance. Harvard University Press, Cambridge, MA
Wald A 1947 Sequential Analysis. Wiley, New York
Whitehead J 1992 The Design and Analysis of Sequential Clinical Trials. Horwood, Chichester, UK
Whittle P 1982/3 Optimization Over Time. Wiley, New York, Vols. 1 and 2
D. A. Berry
Serial Verb Constructions
1. Definition and Importance

The term ‘serial verb construction’ (SVC) is usually applied to a range of apparently similar syntactic constructions in different languages, in which several verbs occur together within one clause or unit, without evidence of either subordination or coordination of the verbs. For example, sentences like (1), from the West African language Twi, are typical of what most
1. Definition and Importance The term ‘serial verb construction’ (SVC) is usually applied to a range of apparently similar syntactic constructions in different languages, in which several verbs occur together within one clause or unit, without evidence of either subordination or coordination of the verbs. For example, sentences like (1), from the West African language Twi, are typical of what most
2. Serial Verb Types

Serial constructions can be classified into different types on the basis of the functions of the verbs in the series relative to one another.
The sentences in (3) provide examples of different functional types of SVC which have been discussed in the literature.

(3) (a) Directional complements: V2 is go, come, or another common intransitive motion verb which functions to indicate the directionality of the action denoted by V1.
    Olu gbe aga wa
    Olu take chair come
    'Olu brought a chair' (Yoruba, West African)

(b) Other motion verb complements: similar to type (a), but V2 may be a transitive motion verb with its own object expressing the goal of the action denoted by V1.
    Kofi tow agyan no wuu Amma
    Kofi throw-PAST arrow the pierce-PAST Amma
    'Kofi shot Amma with the arrow'

(c) Instrumental constructions: V1 is take or a semantically similar common verb, while V2 denotes an action performed with the aid of the object of V1.
    Kofi teki a nefi koti a brede
    Kofi take the knife cut the bread
    'Kofi cut the bread with a knife' (Sranan, Caribbean Creole)

(d) Dative constructions: V2 is usually give, with an object which denotes the (semantic) indirect object of V1.
    Ogyaw ne sika maa me
    he-leave-PAST his money give-PAST me
    'He left me his money' (Twi, West African)

(e) Comparative constructions: a verb meaning pass or surpass is used with an object to express comparison. In this case 'V1' is often in fact an adjective.
    Amba tranga pasa Kofi
    Amba strong pass Kofi
    'Amba is stronger than Kofi' (Sranan, Caribbean Creole)

(f) Resultative constructions: V2 denotes the result or consequence of an action denoted by V1.
    Kofi naki Amba kiri
    Kofi hit Amba kill
    'Kofi struck Amba dead' (or more literally, 'Kofi hit Amba and killed her') (Sranan, Caribbean Creole)

(g) Idiomatic constructions (lexical idioms): These are cases where the meaning of the verbs together is not derivable from the meanings of the verbs separately. They are not found in all serializing languages, but some of the West African group are particularly rich in them.
    Anyi-Baule (West African): bu 'hit', nia 'look', bu…nia 'say, tell'
    Yoruba (West African): la 'cut open', ye 'understand', la…ye 'explain'
3. Distribution in the World's Languages

Serial verb constructions have been identified in many languages but it is clear that they occur in areal clusters
and are not spread evenly among the languages of the world. So far serializing languages have been identified in the following areas: West Africa (Kwa and related language families), the Caribbean (Creole languages which have a historical relationship with Kwa languages), Central America (e.g., Misumalpan), Papua New Guinea (Papuan languages and Tok Pisin, i.e., Melanesian Pidgin English), and South-east Asia (e.g., Chinese, Vietnamese, Thai). There are isolated reports of SVC-like constructions from elsewhere. How credible these are depends on which criteria are adopted to define SVCs.
4. Grammatical Analyses

Numerous researchers have put forward proposals for grammatical analyses of SVCs. The analyses may be classified into two types, with some degree of overlap. Semantic analyses typically seek to account for SVCs in terms of how they break down verbal concepts into more basic semantic components (take, carry, come for bring, for example). Syntactic analyses differ mainly according to whether they treat SVCs as a phrase structure phenomenon (usually under a version of X-bar theory), as a form of complementation, subordination, or secondary predication (i.e., involving more than one clause-like unit), or as a lexical phenomenon (e.g., as single but disjoint lexical items; see Syntax). The following phrase structure (4), proposed for SVCs in the literature, is typical of those advocated by a number of researchers:
(4) VP → V XP VP (where X is N or P)
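As an illustration (not drawn from the cited literature), here is a minimal sketch of how rule (4) builds nested structures; the lexical items are borrowed from example (3f), and the tuple representation of trees is a simplification:

```python
# Illustrative sketch of rule (4), VP -> V XP VP, which licenses
# arbitrarily long verb series through right-branching recursion.

def build_svc(verbs, objects):
    """Build a right-branching VP for a series of verbs.

    Each node is ('VP', verb, object_NP, embedded_VP); the last verb
    in the series heads a VP with no further VP complement.
    """
    head, rest = verbs[0], verbs[1:]
    obj = ('NP', objects[0]) if objects else None
    if not rest:
        return ('VP', head, obj, None)
    return ('VP', head, obj, build_svc(rest, objects[1:]))

# 'Kofi naki Amba kiri' -- the hit...kill resultative of (3f):
print(build_svc(['naki', 'kiri'], ['Amba']))
# ('VP', 'naki', ('NP', 'Amba'), ('VP', 'kiri', None, None))
```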
Rule (4) produces a right-branching tree with a theoretically infinite number of verbs in a series. Such a structure would allow lexical relations and relations like 'subject-of' and 'object-of' to hold between verbs in the series, and captures the intuition that verbs (or verb phrases) function as complements to verbs earlier in the sequence. Some researchers have treated SVCs as involving complementation or subordination, with clauses or clause-like units embedded within the VP and dependent on a higher verb. For example, Larson (1991, pp. 198–205) has argued in favor of analyzing serial constructions as a case of 'secondary predication.' A structure similar to that for Carol rubbed her finger raw in English can also, he says, account for serial combinations of the take…come and hit…kill types ((a) and (f) above). Foley and Olson (1985) propose a distinction between 'nuclear' and 'core' serialization. In nuclear serialization the serial verbs all occur in the nucleus of the clause; in other words, the verbs form a single unit which must share arguments and modifiers. The core layer of the clause consists of the nuclear layer plus the
core arguments of the verb (Foley and Olson 1985, p. 34). In 'core' serialization two cores, each with its own nucleus and arguments, are joined together. Languages may exhibit both kinds of serialization or just one. Examples (a) and (b) below (Foley and Olson 1985, p. 38) illustrate this difference; (a) is an example of core, and (b) of nuclear, serialization:
(5) (a) fu fi fase isoe
        he sit letter write
        'He sat down and wrote a letter' (Barai, Papuan)
    (b) fu fase fi isoe
        he letter sit write
        'He sat writing a letter'

This evidence suggests that 'serialization' may not be a unitary phenomenon, and although a single account which explains the two or more different construction types (such as Foley and Olson's) would be welcome, it may be that separate analyses will be needed as more different types of SVC come to light.

5. The Creolists' Debates

SVCs are among the most salient grammatical features of many Creoles of the Caribbean area which distinguish them grammatically from their European lexifier languages. As such they have attracted attention from creolists, who, finding similar structures in West African languages which historically are connected with the Creoles in question, have claimed SVCs are evidence of substrate influence on the grammar of the Creoles concerned. An alternative view has been offered by Bickerton (1981) and Byrne (1987), who regard the similarities between SVCs in the Creole and West African languages as coincidental. Instead, they argue, SVCs result in Creoles from universal principles of grammar, and are a consequence of the rudimentary verbal structure (V but not VP) which exists, they say, in a Creole at the earliest stage of development. They point out the existence of SVCs or serial-like structures in other pidgins and Creoles, such as Tok Pisin (New Guinea Pidgin) and Hawaiian Creole English, which have no historical connections with West Africa. McWhorter (1997) argues against Bickerton and Byrne, claiming that the Caribbean Creoles not only share SVCs with West African languages, but that their SVCs are structurally similar to each other in ways that SVCs from other areal language groupings are not. Using the nuclear/core distinction proposed by Foley and Olson (see above), he argues that Papuan languages serialize at the nuclear level, while Kwa languages, Caribbean Creoles, Chinese, and other Southeast Asian languages serialize at the core level. Considering also the range of pidgins and Creoles which lack SVCs altogether, he concludes that 'SVCs have appeared around the world in precisely the creoles which had serializing substrata' (1997, p. 39). SVCs have thus become a major site of contention between 'substratists' and 'universalists' in Creole studies, with both sides remaining committed to their own positions (see Pidgin and Creole Languages).

6. SVCs and Language Change

Particular verbs which participate in serial constructions in some cases appear to have been reanalyzed as members of another syntactic category, e.g., from verb to preposition (give > for) or complementizer (say > that). In such cases, the position of the verb usually makes it amenable to such a reanalysis (e.g., if it typically occurs immediately before its objects, it may be reanalyzed as a preposition if the semantics encourage this interpretation). Reanalyzed serial verbs lose, to varying degrees, their verbal properties and take on, again to varying degrees, the morphological characteristics of their new category. Lord (1973) describes cases of reanalysis of verbs as prepositions, comitative markers, and a subordinating conjunction in Yoruba, Ga, Ewe, and Fon. Verbs with the meaning 'say' are susceptible to reinterpretation as complementizers (introducing sentential clauses), while those with the meaning 'finish' have a tendency to be interpreted as completive aspect markers (see Grammaticalization).
7. Related Constructions

Lack of a satisfactory definition of 'serial verb construction' makes it difficult to decide what may count as a related kind of structure. Chinese has a range of verbal structures which bear resemblances to SVCs. One of these, the class of co-verbs, is widely believed to be the result of the reanalysis of serial verbs as prepositions. Others, the so-called resultatives, resemble secondary predications. The Bantu languages, though related to the serializing languages of West Africa, do not seem to have SVCs. However, some have verbal chains where the second and subsequent verbs have different marking from the first. In many Bantu languages, the verb meaning say is homophonous with a complementizer. These phenomena remain to be explained by a general theory.

See also: Syntactic Aspects of Language, Neural Basis of; Syntax; Valency and Argument Structure in Syntax
Bibliography

Bickerton D 1981 Roots of Language. Karoma, Ann Arbor, MI
Byrne F 1987 Grammatical Relations in a Radical Creole: Verb Complementation in Saramaccan. Benjamins, Amsterdam
Foley W A, Olson M 1985 Clausehood and verb serialization. In: Nichols J A, Woodbury A C (eds.) Grammar Inside and Outside the Clause: Some Approaches to Theory from the Field. Cambridge University Press, Cambridge, UK, pp. 17–60
Larson R K 1991 Some issues in verb serialization. In: Lefebvre C (ed.) Serial Verbs: Grammatical, Comparative and Cognitive Approaches. Benjamins, Amsterdam, pp. 184–210
Lefebvre C (ed.) 1991 Serial Verbs: Grammatical, Comparative and Cognitive Approaches. Benjamins, Amsterdam
Lord C 1973 Serial verbs in transition. Studies in African Linguistics 4(3): 269–96
McWhorter J H 1997 Towards a New Model of Creole Genesis. Peter Lang, New York
Sebba M 1987 The Syntax of Serial Verbs: An Investigation into Serialization in Sranan and Other Languages. Benjamins, Amsterdam
M. Sebba
Service Economy, Geography of

Producer services are types of services demanded primarily by businesses and governments, used as inputs in the process of production. We may split the demand for all services broadly into two categories: (a) demands originating from household consumers for services that they use, and (b) demands for services originating from other sources. An example of services demanded by household consumers is retail grocery services, while examples of producer services are management consulting services, advertising services, and computer systems engineering.
1. Development of the Term 'Producer Services'

Producer services have emerged as one of the most rapidly growing industries in advanced economies, while the larger service economy has also exhibited aggregate growth in most countries. In the 1930s, Fisher (1939) observed the general tendency for shifts in the composition of employment as welfare rose, with an expanding service sector. Empirical evidence of this transformation was provided by Clark (1957), and the Clark–Fisher model was a popular description of development sequences observed in nations that were early participants in the Industrial Revolution. However, the model's claim to represent a necessary pattern of economic development was criticized by scholars who observed regions and nations that did not experience the pattern of structural transformation encompassed in the Clark–Fisher model. Moreover, critics of the Clark–Fisher model also argued that the size of the service economy relative to that of goods and primary production required efforts to classify the service economy into categories more meaningful than Fisher's residual category 'services.'
The classification of service industries based upon their nature and their source of demand was pioneered in the 1960s and 1970s. The classic vision of services developed by Adam Smith ('they perish the very instant of their performance') was challenged as scholars recognized that services such as legal briefs or management services can have enduring value, similar to investments in physical capital. The term producer services became applied to services with 'intermediate' as opposed to 'final' markets. While this distinction between intermediate or producer services and final or consumer services is appealing, it is not without difficulties. Some industries do not fit neatly into one category or another. For example, households as well as businesses and governments demand legal services, and hotels, generally regarded as a consumer service, serve both business travelers and households. Moreover, many of the functions performed by producer service businesses are also performed internally by other businesses. An example is the presence of in-house accounting functions in most businesses, while at the same time there are firms specializing in performing accounting services for their clients. Notwithstanding these difficulties of classification, there is now widespread acceptance of the producer services as a distinctive category of service industry. However, variations in the classification of industries among nations, as well as differences in the inclusion of specific industries by particular scholars, lead to varying sectoral definitions of producer services. Business, legal, and engineering and management services are included in most studies, while the financial services are often considered to be a part of the producer services. It is less common to consider services to transportation, wholesaling, and membership organizations as a component of the producer services. Table 1 documents growth in US producer services employment between 1985 and 1995 by broad industrial groups. The growth of producer services over this time period was double the national rate of job growth, and this growth rate was almost identical in metropolitan and rural areas. While financial services grew relatively slowly, employment growth in business and professional services was very strong. Key sectors within this group include temporary help agencies, advertising, copy and duplicating, equipment rental, computer services, detective and protective, architectural and engineering, accounting, research and development, and management consulting and public relations services.
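As a rough consistency check on Table 1 (below), each row's implied 1985 base employment can be derived as job growth divided by the growth rate, and the 'double the national rate' claim follows from comparing 45.0 with 23.4 percent. A minimal sketch, using only the table's own figures:

```python
# Derive implied 1985 base employment (thousands of jobs) from
# Table 1: base = job_growth / (pct_growth / 100). All figures are
# taken from the table itself, not from independent data.
rows = {
    "Finance, insurance, and real estate": (994, 16.6),
    "Business and professional services": (4_249, 79.2),
    "Legal services": (275, 40.0),
    "Membership organizations": (599, 38.6),
    "Total, producer services": (6_117, 45.0),
    "Total, all industries": (19_039, 23.4),
}
for sector, (growth, pct) in rows.items():
    base = growth / (pct / 100)
    print(f"{sector}: ~{base:,.0f} thousand jobs in 1985")

# Producer services grew at roughly double the all-industry rate:
print(45.0 / 23.4)  # ~1.92
```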
Table 1 US employment change in producer services, 1985–95

Sector                                 Job growth (thousands)   Percentage growth
Finance, insurance, and real estate                       994                16.6
Business and professional services                      4,249                79.2
Legal services                                            275                40.0
Membership organizations                                  599                38.6
Total, producer services                                6,117                45.0
Total, all industries                                  19,039                23.4

2. Early Research on the Geography of Producer Services

Geographers' and regional economists' research on producer services only started in the 1970s. Pioneering
research includes the investigations of office location in the United Kingdom by Daniels (1975), and the studies of corporate control, headquarters, and research functions related to corporate headquarters undertaken in the United States by Stanback (1979). Daniels (1985) provided a comprehensive summary of this research, distinguishing between theoretical approaches designed to explain the geography of consumer and producer services. The emphasis in this early research focused on large metropolitan areas, recognizing the disproportionate concentration of producer services employment in the largest nodal centers. This concentration was viewed as a byproduct of the search for agglomeration economies, both within freestanding producer services and in the research and development organizations associated with corporate headquarters. This early research also identified the growth of international business service organizations that were concentrated in large metropolitan areas to serve both transnational offices of home-country clients and to generate an international client base. Early research on producer services also documented tendencies for decentralization from central business districts into suburban locations, as well as the relatively rapid development of producer services in smaller urban regions. In the United States, Noyelle and Stanback (1983) published the first comprehensive geographical portrait of employment patterns within the producer services, differentiating their structure through a classification of employment, and relating this classification to growth trends between 1959 and 1976. While documenting relatively rapid growth of producer services in smaller urban areas with relatively low employment concentrations, this early research did not point towards employment decentralization, although uncertainties related to the impact of the development of information technologies were recognized as a factor that would temper future geographical trends. The geography of producer services was examined thoroughly in the United Kingdom through the use of secondary data in the mid-1980s (Marshall et al. 1988). This research documented the growing importance of the producer services as a source of employment, in an economy that was shedding manufacturing jobs. It encouraged a richer appreciation of the role of
producer services by policy-makers and scholars and called for primary research to better understand development forces within producer service enterprises. While many scholars in this early period of research on producer services emphasized market linkages with manufacturers or corporate headquarter establishments, other research found industrial markets to be tied broadly to all sectors of the economy, a fact also borne out in input-output accounts. Research also documented geographic markets of producer service establishments, finding that a substantial percentage of revenues were derived from nonlocal clients, making the producer services a component of the regional economic base (Beyers and Alvine 1985).
3. Emphasis in Current Research on the Geography of Producer Services

In the 1990s there has been an explosion of geographic research on producer services. Approaches to research in recent years can be grouped into studies based on secondary data describing regional patterns or changes in patterns of producer service activity, and studies utilizing primary data to develop empirical insights into important theoretical issues. For treatment of related topics see Location Theory; Economic Geography; Industrial Geography; Finance, Geography of; Retail Trade; Urban Geography; and Cities: Capital, Global, and World.
3.1 Regional Distribution of Producer Services

The uneven distribution of producer service employment continues to be documented in various national studies, including recent work for the United States, Canada, Germany, and the Nordic countries. While these regions have exhibited varying geographic trends in the development of producer services, they share in common the relatively rapid growth of these industries. They also share in common the fact that relatively few regions have a concentration of or a share of employment at or above the national average. In Canada, the trend has been towards greater concentration, a pattern which Coffey and Shearmur
(1996) describe as uneven spatial development. Utilizing data for the 1971 to 1991 time period, they find a broad tendency for producer service employment to have become more concentrated over time. The trend in the United States, Germany, and the Nordic countries differs from that found in Canada. In the Nordic countries producer service employment is strongly concentrated in the capitals, but growth in peripheral areas of Norway and Finland is generally well above average, while in Denmark and Sweden many peripheral areas exhibit slow growth rates (Illeris and Sjøholt 1995). In Germany, the growth pattern exhibits no correlation between city size and the growth of business service employment (Illeris 1996). The United States also exhibits an uneven distribution of employment in the producer services. In 1995 only 34 of the 172 urban-focused economic areas defined by the US Bureau of Economic Analysis had an employment concentration in producer services at or above the national average, and these were predominantly the largest metropolitan areas in the country. However, between 1985 and 1995 the concentration of employment in these regions diminished somewhat, while regions with the lowest concentrations in 1985 tended to show increases in their level of producer services employment. Producer service growth rates in metropolitan and rural territory have been almost identical over this same time period in the United States. There is also considerable evidence of deconcentration of employment within metropolitan areas from central business districts into 'Edge Cities' and suburban locations in Canada and the United States (Coffey et al. 1996a, Coffey et al. 1996b, Harrington and Campbell 1997). Survey research in Montreal indicates that this deconcentration is related more strongly to new producer service businesses starting up in suburban locations than to the relocation of existing establishments from central city locations (Coffey et al. 1996b).
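Although the passage does not name it, the 'concentration at or above the national average' criterion used in such studies corresponds to the standard location quotient of regional analysis; a minimal sketch with invented figures:

```python
def location_quotient(region_sector, region_total,
                      nation_sector, nation_total):
    """Ratio of a sector's regional employment share to its national
    share; LQ >= 1 marks at-or-above-average concentration."""
    return (region_sector / region_total) / (nation_sector / nation_total)

# Hypothetical figures (thousands of jobs), for illustration only.
lq = location_quotient(region_sector=120, region_total=800,
                       nation_sector=19_700, nation_total=117_000)
print(round(lq, 2))  # ~0.89 -> below the national average
```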
3.2 Producer Services and Regional Development

Numerous studies have now been undertaken documenting the geographic markets of producer service establishments; for summaries see Illeris (1996) and Harrington et al. (1991). Illeris' summary of these studies indicates typical nonlocal market shares at 35 percent, but if nonlocal sales are calculated by weighting the value of sales, the nonlocal share rises to 56 percent. In both urban and rural settings establishments are divided into those with strong interregional or international markets, and those with primarily localized markets. Over time establishments tend to become more export oriented, with American firms tending to have expanded interregional business, while European firms more often enter international markets (O'Farrell et al. 1998). Thus, producer services contribute to the economic base of communities and
their contribution is rising over time due to their relatively rapid growth rate. Growth in producer service employment in the United States has occurred largely through the expansion in the number of business establishments, with little change in the average size of establishments. Between 1983 and 1993 the number of producer service establishments increased from 0.95 million to 1.34 million, while average employment per establishment increased from 11 to 12 persons. The majority of this employment growth occurred in single-unit establishments, nonpayroll proprietorships, and partnerships. People are starting these new producer service establishments because they want to be their own boss, they have identified a market opportunity, their personal circumstances lead them to wish to start a business, or they have identified ways to increase their personal income. Most people starting new companies were engaged in the same occupation in the same industry before starting their firm, and few report that they started companies because they were put out of work by their former employer in a move designed to downsize in-house producer service departments (Beyers and Lindahl 1996a). The use of producer services also has regional development impacts. Work in New York State (MacPherson 1997) has documented that manufacturers who make strong use of producer services as inputs are more innovative than firms who do not use these services, which in turn has helped stimulate their growth rate. Similar positive impacts on the development of clients of producer service businesses have been documented in the United Kingdom (Wood 1996) and more broadly in Europe (Illeris 1996).
3.3 Demand Factors

Considerable debate has raged over the reasons for the rapid growth of the producer services, and one common perception has been that this growth has occurred because of downsizing and outsourcing by producer service clients to save money on the acquisition of these services. However, evidence has accumulated that discounts the significance of this perspective. Demand for producer services is increasing for a number of reasons beyond the cost of acquiring these services, based on research with both the users of these services and their suppliers (Beyers and Lindahl 1996a, Coffey and Drolet 1996). Key factors driving the demand for producer services include: the lack of expertise internal to the client to produce the service, a mismatch between the size of the client and the volume of their need for the service, the need for third-party opinions or expert testimony, increases in government regulations requiring use of particular services, and the need for assistance in managing the complexity of firms or to stay abreast of new technologies.
Evidence also indicates that producer service establishments often do business with clients who have in-house departments producing similar services. However, it is rare that there is direct competition with these in-house departments. Instead, relationships are generally complementary. Current evidence indicates that there have been some changes in the degree of in-house vs. market acquisition of producer services, but those selling these services do not perceive the balance of these in-house or market purchases changing dramatically (Beyers and Lindahl 1996a).

3.4 Supply and Competitive Advantage Considerations

The supply of producer services is undertaken in a market environment that ranges from being highly competitive to one in which there is very little competition. In order to position themselves in this marketplace, producer service businesses exhibit competitive strategies, and as with the demand side, recent evidence points towards the prevalence of competitive strategies based on differentiation as opposed to cost (Lindahl and Beyers 1999, Hitchens et al. 1996, Coffey 1996). The typical firm tries to develop a marketplace niche, positioning itself to be different from competitors through forces such as an established reputation for supplying their specialized service, the quality of their service, their personal attention to client needs, their specialized expertise, the speed with which they can perform the service, their creativity, and their ability to adapt quickly to client needs. The ability to deliver the service at a lower cost than the client could produce it is also a factor considered important by some producer service establishments.

3.5 Flexible Production Systems

The flexible production paradigm has in recent years been extended from its origin in the manufacturing environment into the producer services (Coffey and Bailly 1991). There are a variety of aspects to the issue of flexibility in the production process, including the nature of the service being produced, the way in which firms organize themselves to produce their services, and their relationships with other firms in the production and delivery process. The labor force within producer services has become somewhat more complex, with the strong growth of the temporary help industry that dispatches a mixture of part-time and full-time temporary workers, as well as some increase in the use of contract or part-time employees within producer service establishments. However, over 90 percent of employment remains full-time within the producer services (Beyers and Lindahl 1999). The production of producer services is most frequently undertaken in a manner that requires the labor force to be organized in a customized manner for each job, although a minority of firms approach work
in a routinized manner. Almost three-fourths of firms rely on outside expertise to produce their service, and half of producer service firms engage in collaboration with other firms to produce their services, to extend their range of expertise or geographic markets. The services supplied by producer service establishments are also changing frequently; half of the establishments interviewed in a US study had changed their services over the previous five years. They do so for multiple reasons, including changing client expectations, shifts in geographic or sectoral markets, changes in government regulations, and changes in information technologies with their related effects on the skills of employees. These changes most frequently produce more diversified service offerings, but an alternative pathway is to become more specialized or to change the form or nature of services currently being produced (Beyers and Lindahl 1999).
3.6 Information Technologies

Work in the producer services typically involves face-to-face meetings with clients at some point in the production process. This work may or may not require written or graphical documents. In some cases the client must travel to the producer service firm's office, and in other cases the producer service staff travels to the client. These movements can be localized or international in scale. However, in addition to these personal meetings, there is extensive and growing use of a variety of information technologies in the production and delivery of producer services work. Computer networks, facsimile machines, telephone conversations, courier services, and the postal system all play a role, including a growing use of the Internet, e-mail, and computer file transfers between producers and clients. Some routine functions have become the subject of e-commerce, but much of the work undertaken in the producer services is nonroutine, requiring creative interactions between clients and suppliers.
3.7 Location Factors

The diversity of market orientations of producer service establishments leads to divergent responses with regard to the choice of business locations (Illeris 1996). Most small establishments are located convenient to the residential location of the founder. However, founders often search for a residence that suits them, a factor driven by quality-of-life considerations for many rural producer service establishments (Beyers and Lindahl 1996b). Businesses with spatially dispersed markets are drawn to locations convenient to travel networks, including airports and the Interstate Highway System in the US. Businesses
with highly localized markets position themselves conveniently to these markets. Often ownership of a building, or a prestigious site that may affect the trust or confidence of clients, is an influencing factor. Establishments that are part of large multi-establishment firms select locations useful from a firm-network-of-locations perspective, which may be either an intra- or interurban pattern of offices.
4. Methodological Considerations

Much of the research reported upon here has been conducted in the United States, Canada, or the United Kingdom. While there has been a rapid increase in the volume of research reported regarding the themes touched upon in the preceding paragraphs, the base of field research is still slender even within the nations just mentioned. More case studies are needed in urban and rural settings, focused on marketplace dynamics, production processes, and the impact of technological development on the evolution of the producer services. The current explosion of e-commerce and its use by producer service firms is a case in point. Recent research has tended to be conducted either by surveying producer service enterprises or their clients, but only rarely are both surveyed to obtain answers that can be cross-tabulated. Individual researchers have developed their own protocols for survey research, yielding results that tend to be noncomparable to other research. Means need to be developed to bring theoretical and empirical approaches into greater consistency and comparability. There are variations among countries in the organization of production systems related to the producer services. Some countries internalize these functions to a relatively high degree within other categories of industrial activity, and greater thought needs to be given to ways in which these functions can be measured within such organizations.
5. Future Directions of Theory and Research

Research on the producer services will continue at both an aggregate scale and at the level of the firm or establishment. Given the recent history of job growth in this sector of advanced economies, there will certainly be studies documenting the ongoing evolution of the geographical distribution of producer services. This research needs to proceed at a variety of spatial scales, ranging from intrametropolitan, to intermetropolitan or interregional within nations, and across nations. International knowledge of development trends is currently sketchy, especially in developing countries, where accounts may be less
disaggregated than in developed countries with regard to service industries. Research at the level of the firm and establishment must continue in order to better understand the start-up and evolution of firms. While there is a growing body of evidence regarding the motivations and histories of firm founders, there is currently little knowledge of the movement of employees into and among producer service establishments. There is also meager knowledge of the distribution of producer service occupations and work in establishments not classified as producer services. Case studies are needed of collaboration, subcontracting, client-seller interaction, processes of price formation, the interplay between the evolution of information technologies and service-product concepts, and types of behavior that are related to superior performance as measured by indicators such as sales growth rate, sales per employee, or profit. There is also a pressing need for international comparative research, not just in the regions that have been relatively well researched (generally the US, Canada, and Europe), but also in other parts of the planet. Finally, there is a pressing need for the development of more robust models and theory in relation to the geography of producer services.

See also: Geodemographics; Market Areas; Services Marketing
Bibliography

Beyers W B, Alvine M J 1985 Export services in postindustrial society. Papers of the Regional Science Association 57: 33–45
Beyers W B, Lindahl D P 1996a Explaining the demand for producer services: Is cost-driven externalization the major factor? Papers in Regional Science 75: 351–74
Beyers W B, Lindahl D P 1996b Lone eagles and high fliers in rural producer services. Rural Development Perspectives 12: 2–10
Beyers W B, Lindahl D P 1999 Workplace flexibilities in the producer services. The Service Industries Journal 19: 35–60
Clark C 1957 The Conditions of Economic Progress. Macmillan, London
Coffey W J 1996 Forward and backward linkages of producer service establishments: Evidence from the Montreal metropolitan area. Urban Geography 17: 604–32
Coffey W J, Bailly A S 1991 Producer services and flexible production: An exploratory analysis. Growth and Change 22: 95–117
Coffey W J, Drolet R 1996 Make or buy? Internalization and externalization of producer service inputs in the Montreal metropolitan area. Canadian Journal of Regional Science 29: 25–48
Coffey W J, Drolet R, Polèse M 1996a The intrametropolitan location of high order services: Patterns, factors and mobility in Montreal. Papers in Regional Science 75: 293–323
Coffey W J, Polèse M, Drolet R 1996b Examining the thesis of central business district decline: Evidence from the Montreal metropolitan area. Environment and Planning A 28: 1795–1814
Coffey W J, Shearmur R G 1996 Employment Growth and Change in the Canadian Urban System, 1971–94. Canadian Policy Research Networks, Ottawa, Canada
Daniels P W 1975 Office Location. Bell, London
Daniels P W 1985 Service Industries: A Geographical Appraisal. Methuen, London
Fisher A 1939 Production, primary, secondary, and tertiary. Economic Record 15: 24–38
Harrington J W, Campbell H S Jr 1997 The suburbanization of producer service employment. Growth and Change 28: 335–59
Harrington J W, MacPherson A D, Lombard J R 1991 Interregional trade in producer services: Review and synthesis. Growth and Change 22: 75–94
Hitchens D M W N, O'Farrell P N, Conway C D 1996 The competitiveness of business services in the Republic of Ireland, Northern Ireland, Wales, and the South East of England. Environment and Planning A 28: 1299–1313
Illeris S 1996 The Service Economy: A Geographical Approach. Wiley, Chichester, UK
Illeris S, Sjøholt P 1995 The Nordic countries: High quality services in a low density environment. Progress in Planning 43: 205–221
Lindahl D P, Beyers W B 1999 The creation of competitive advantage by producer service establishments. Economic Geography 75: 1–20
MacPherson A 1997 The role of producer service outsourcing in the innovation performance of New York State manufacturing firms. Annals of the Association of American Geographers 87: 52–71
Marshall J, Wood P A, Daniels P W, McKinnon A, Bachtler J, Damesick P, Thrift N, Gillespie A, Green A, Leyshon A 1988 Services and Uneven Development. Oxford University Press, Oxford, UK
Noyelle T J, Stanback T M 1983 The Economic Transformation of American Cities. Rowman and Allanheld, Totowa, NJ
O'Farrell P N, Wood P A, Zheng J 1998 Regional influences on foreign market development by business service companies: Elements of a strategic context explanation. Regional Studies 32: 31–48
Stanback T M 1979 Understanding the Service Economy: Employment, Productivity, Location. Allanheld and Osmun, Totowa, NJ
Wood P 1996 Business services, the management of change and regional development in the UK: A corporate client perspective. Transactions of the Institute of British Geographers 21: 649–65
W. B. Beyers
Services Marketing

Marketing, as a philosophy and as a function, has already reached the maturity stage. In the 1950s, the basic emphasis was on consumer goods and the mass marketing approach; in the 1960s, it was the marketing potential of durable goods; and in the 1970s, the emphasis was on industrial goods. Only in the 1980s did service organizations start to take a professional interest in marketing approaches
and tools. More recently, the public services, and nonprofit services in general, have also begun to participate in marketing. Despite its recent arrival, services marketing is undoubtedly the most innovative area and, thanks to new technology, seems likely to change the old approach completely. The trend is moving from mass marketing to one-to-one marketing, a typical feature of services promotion.
1. Peculiarities of Services Marketing

If services marketing is becoming so important, it is necessary to understand its peculiarities. Everything is related to the intangibility of services: they are 'events' more than 'goods,' and as a consequence it is impossible to store them. They are 'happenings,' where it is difficult to forecast what will be produced and what the consumer will get. The relationship becomes increasingly important, especially when the consumer becomes a 'part-time producer,' a 'pro-sumer' (producer + consumer). Interaction with people is very important in many services, such as restaurants, air transportation, tourism, banking, and so on. The interaction is not only between producer and consumer, but also among the users, as happens in schools among students. Because of these elements there is another very important aspect: in services marketing it is difficult to standardize performance, and quality can be very different from one case to another. Quality control ahead of the event is impossible, because nobody knows in advance the service that will actually be provided to the customer. Many external factors can affect the level of quality, e.g., the climate, or the unpredictable behavior of customers. Consequently, it is not always possible to fulfill promises at the point of delivery, even if the organization does its best to deliver the required level of quality. Another peculiarity is that services are not 'visible' and so cannot represent a 'status symbol,' unless the service can be matched with tangible objects, such as credit cards, university ties, football scarves, etc. But the most important aspect is that services cannot be stored, and therefore production and use are simultaneous. This creates problems in terms of production capacity, because it is difficult to match fluctuations in demand over seasons, weeks, or days. It is possible to synchronize production and use by means of different tools. The most used approach is pricing, because varying service prices may be applied at different periods of time (e.g., telephone charges, or airline tickets). There are other tools, such as 'user education,' implemented by the service producer. In this case, the producer tries to 'educate' the consumer to ask for a particular, better-quality, service during the so-called 'valley periods' when sales are slow. Other solutions involve the employment of part-time workers during 'peak periods,' or the
implementation of maintenance activity in the 'valley periods'; making reservations is a typical approach in the case of theatres or in the health care service. In all cases it is necessary to overcome fluctuating demand with a 'synchromarketing' activity.
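As an illustration of differential pricing as a synchromarketing tool, here is a minimal sketch; the peak window, base rate, and discount are invented for the example:

```python
# Illustrative time-of-day pricing: discounting 'valley periods'
# to shift demand away from peaks (cf. telephone charges).
PEAK_HOURS = range(8, 20)   # hypothetical peak window
BASE_RATE = 0.30            # hypothetical per-minute peak charge
VALLEY_DISCOUNT = 0.5       # hypothetical off-peak discount factor

def call_price(hour, minutes):
    """Price a call; off-peak hours get the valley discount."""
    rate = BASE_RATE if hour in PEAK_HOURS else BASE_RATE * VALLEY_DISCOUNT
    return round(rate * minutes, 2)

print(call_price(10, 10))  # peak:   3.0
print(call_price(22, 10))  # valley: 1.5
```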
2. Critical Success Factors in Services Marketing

Because of the peculiarities of the services, there are many critical success factors in services marketing: (a) service industrialization, (b) strategic image management, (c) customer satisfaction surveys, (d) operations, personnel, and marketing interaction, (e) internal marketing, (f) user education and participation, (g) managerial control of costs and investments, (h) relationship with the public authorities, (i) quality control, and (j) synchromarketing.

Of these, further comment can be devoted to internal marketing, whose goal is 'to create an internal environment which supports customer consciousness and sales mindedness among the personnel' (Gronroos 1990). 'Internal marketing means applying the philosophy and practices of marketing to people who serve the external customers, so that the best possible people can be employed and retained, and they will do the best possible work' (Berry and Parasuraman 1991).

Another point to emphasize is quality control because, as has been said, the quality of a service is perceived rather than objective. Many studies say that the focal point in services is the perceived quality, the result of a comparison between the expected quality and the image of the service actually provided. As a consequence, it is very important to survey customer opinion on a regular basis, and to check whether, via this opinion, service is becoming better or worse.

A further point is related to the importance of a company's image as a factor affecting, positively or negatively, customers' judgment. This is why it is necessary to manage a company's image with a strongly strategic approach, outlining what the company is, what people think it is, and how it wants people to think of it. Image improvement is not only a matter of communication; it also involves personnel behavior, services provided, and material facilities.
3. Future Perspectives

At the start of the twenty-first century, services marketing is very important in the developed countries, because of the relevance of services in terms of GNP, employment, etc., and we can predict that it is going to increase further in importance, thanks to new technologies that facilitate the relationship between producer and customer. There will be an increasing amount of one-to-one marketing, which is particularly
suitable for services. Another important development will concern the internationalization of services (e.g., McDonald's, Club Med, Manchester United, Hertz, etc.), where the critical success factors will be related to the improvement of 'high tech' in the back office and 'high touch' in the front line: industrialization and personalization at the same time, to achieve high customer satisfaction in every country. Sales of products now often include the offer of some services, as, for example, in the automotive industry, where this is represented by the after-sales service, a very important aspect affecting the buyer decision process. Offers such as this are the key issues that, in most cases, actually make the difference, given the fact that the basic product is very frequently similar to other products, and the difference is provided by the service included in the product.

See also: Advertising Agencies; Advertising and Advertisements; Advertising: General; Advertising, Psychology of; Computers and Society; Internet: Psychological Perspectives; Market Research; Marketing Strategies; Markets: Artistic and Cultural; Service Economy, Geography of
Bibliography

Berry L L, Parasuraman A 1991 Marketing Services: Competing Through Quality. The Free Press
Cherubini S 1981 Il Marketing dei Servizi. Franco Angeli Editore
Cowell D 1984 The Marketing of Services. Heinemann
Donnelly J H, George W R (eds.) 1981 Marketing of Services. AMA's Proceedings Series
Eiglier P, Langeard E 1987 Servuction: Le Marketing des Services. McGraw-Hill, New York
Eiglier P, Langeard E, Lovelock C H, Bateson J E G, Young R F 1977 Marketing Consumer Services: New Insights. Marketing Science Institute
Fisk R P 2000 Interactive Services Marketing. Houghton Mifflin
Gronroos C 1990 Service Management and Marketing: Managing the Moments of Truth in Service Competition. Lexington Books
Heskett J L, Sasser W E, Hart C W L 1986 Service Breakthroughs: Changing the Rules of the Game. The Free Press, New York
Lovelock C H 1984 Services Marketing. Prentice-Hall, Englewood Cliffs, NJ
Normann R 1984 Service Management: Strategy and Leadership in Service Businesses. Wiley, New York
Payne A 1993 The Essence of Services Marketing. Prentice-Hall, Englewood Cliffs, NJ
Payne A, McDonald M H B 1997 Marketing Planning for Services. Butterworth-Heinemann, Oxford, UK
Zeithaml V A 1996 Services Marketing. McGraw-Hill, New York
S. Cherubini
Settlement and Landscape Archaeology

For contemporary archaeology, settlement and landscape approaches represent an increasingly important focus that is vital for a core mission of the discipline: to describe, understand, and explain long-term cultural and behavioral change. Despite this significance, few syntheses of this topic have been undertaken (cf. Parsons 1972, Ammerman 1981, Fish and Kowalewski 1990, Billman and Feinman 1999). Yet settlement and landscape approaches provide the only large-scale perspective for the majority of premodern societies. These studies are reliant on archaeological surface surveys, which discover and record the distribution of material traces of past human presence/habitation across a landscape (see Survey and Excavation (Field Methods) in Archaeology). The examination and analysis of these physical remains found on the ground surface (e.g., potsherds, stone artifacts, house foundations, or earthworks) provide the empirical foundation for the interpretation of ancient settlement patterns and landscapes.
1. Historical Background
Although the roots of settlement pattern and landscape approaches extend back to the end of the nineteenth century, archaeological survey has only come into its own in the post-World War II era. Spurred by the analytical emphases of Steward (1938), Willey's Virú Valley archaeological survey (1953) provided a key impetus for settlement pattern research in the Americas. In contrast, the landscape approach, which has a more focal emphasis on the relationship between sites and their physical environments, has its roots in the UK. Nevertheless, contemporary archaeological studies indicate a high degree of intellectual cross-fertilization between these different surface approaches.
1.1 Early Foundations for Archaeological Survey in the Americas and England

The American settlement pattern tradition stems back to scholars, such as Morgan (1881), who queried how the remnants of Native American residential architecture reflected the social organization of the native peoples who occupied them. Yet the questions posed
by Morgan led to relatively few immediate changes in how archaeology was practiced, and for several decades few scholars endeavored to address the specific questions regarding the relationship between settlement and social behavior that Morgan posed. When surface reconnaissance was undertaken by archaeologists, it tended to be a largely unsystematic exercise carried out to find sites worthy of excavation. In the UK, the landscape approach, pioneered by Fox (1922), was more narrowly focused on the definition of distributional relationships between different categories of settlements and environmental features (e.g., soils, vegetation, topography). Often these early studies relied on and summarized surveys and excavations that were carried out by numerous investigators using a variety of field procedures rather than more uniform or systematic coverage implemented by a single research team. At the same time, the European landscape tradition generally has had a closer link to romantic thought as opposed to the more positivistic roots of the North American settlement pattern tradition (e.g., Sherratt 1996).
1.2 The Development of Settlement Archaeology
By the 1930s and 1940s, US archaeologists working in several global regions recognized that changing patterns of social organization could not be reconstructed and interpreted through empirical records that relied exclusively on the excavation of a single site or community within a specific region. For example, in the lower Mississippi Valley, Phillips et al. (1951) located and mapped archaeological sites across a large area to analyze shifting patterns of ceramic styles and settlements over broad spatial domains and temporal contexts. Yet the most influential and problem-focused investigation of that era was that of Willey in the Virú Valley. Willey's project was the first to formally elucidate the scope and potential analytical utility of settlement patterns for understanding long-term change in human economic and social relationships. His vision moved beyond the basic correlation of environmental features and settlements, as well as beyond the mere definition of archetypal settlement types for a given region. In addition to its theoretical contributions, the Virú program also was innovative methodologically, employing (for the first time in the Western Hemisphere) vertical air photographs in the location and mapping of ancient settlements.
Although Willey did not carry out his survey entirely on foot, he did achieve reasonably systematic areal coverage for a defined geographic domain for which he could examine changes in the frequency of site types, as well as diachronic shifts in settlement patterns. Conceptually and methodologically, these early settlement pattern projects of the 1930s and 1940s established the intellectual underpinnings for a number of multigenerational regional archaeological survey programs that were initiated in at least four global regions during the 1950s and 1960s. In many ways, these later survey programs were integral to the theoretical and methodological re-evaluations that occurred in archaeological thought and practice under the guise of 'the New Archaeology' or processualism. The latter theoretical framework stemmed in part from an expressed emphasis on understanding long-term processes of behavioral change and cultural transition at the population (and so regional) scale. This perspective, which replaced a more normative emphasis on archetypal sites or cultural patterns, was made possible to a significant degree by the novel diachronic and broad scalar vantages pieced together for specific areas through systematic regional settlement pattern fieldwork and analysis.
1.3 Large-scale Regional Survey Programs
During the 1950s through the 1970s, major regional settlement pattern programs were initiated in the heartlands of three areas where early civilizations emerged (Greater Mesopotamia, highland Mexico, and the Aegean), as well as in one area known for its rich and diverse archaeological heritage (the Southwest USA). The achievements of the Virú project also stimulated continued Andean settlement pattern surveys, although a concerted push for regional research did not take root there until somewhat later (e.g., Parsons et al. 1997, Billman and Feinman 1999). Beginning in 1957, Robert M. Adams (e.g., 1965, 1981) and his associates methodically traversed the deserts and plains of the Near East by jeep, mapping earthen tells and other visible sites. Based on the coverage of hundreds of square kilometers, these pioneering studies of regional settlement history served to unravel some of the processes associated with the early emergence of social, political, and economic complexity in Greater Mesopotamia. Shortly thereafter, in highland Mexico, large-scale, systematic surveys were initiated in the area's two largest mountain valleys (the Basin of Mexico and the Valley of Oaxaca). These two projects implemented field-by-field, pedestrian coverage of some of the largest contiguous survey regions in the world, elucidating the diachronic settlement patterns for regions in which some of the earliest and most extensive cities in the ancient Americas were situated (e.g., Sanders et al. 1979, Blanton et al. 1993). After decades, about
half of the Basin of Mexico and almost the entire Valley of Oaxaca were traversed by foot. In the Aegean, regional surveys (McDonald and Rapp 1972, Renfrew 1972) were designed to place important sites with long excavation histories in broader spatial contexts. Once again, these investigations brought new regional vantages to areas that already had witnessed decades of excavation and textual analyses. Over the same period, settlement pattern studies were carried out in diverse ecological settings across the US Southwest, primarily to examine the differential distributions of archaeological sites in relation to their natural environments, and to determine changes in the numbers and sizes of settlements across the landscape over time. In each of the areas investigated, the wider the study domain covered, the more diverse and complex were the patterns found. Growth in one part of a larger study area was often timed with the decrease in the size and number of sites in another. And settlement trends for given regions generally were reflected in episodes of both growth and decline. Each of these major survey regions (including much of the Andes) is an arid to semiarid environment. Without question, broad-scale surface surveys have been most effectively implemented in regions that lack dense ground cover, and therefore the resultant field findings have been most robust. In turn, these findings have fomented long research traditions carried out by trained crews, thereby contributing to the intellectual rewards of these efforts. As Ammerman (1981, p. 74) has recognized, ‘major factors in the success of the projects would appear to be the sheer volume of work done and the experience that workers have gradually built up over the years.’
1.4 Settlement Pattern Research at Smaller Scales of Analysis

Although settlement pattern approaches were most broadly applied at the regional scale, other studies followed similar conceptual principles in the examination of occupational surfaces, structures, and communities. At the scale of individual living surfaces or house floors, such distributional analyses have provided key indications as to which activities (such as cooking, food preparation, and toolmaking) were undertaken in different sectors (activity areas) of specific structures (e.g., Flannery and Winter 1976) or surfaces (e.g., Flannery 1986, pp. 321–423). In many respects, the current emphasis on household archaeology (e.g., Wilk and Rathje 1982) is an extension of settlement pattern studies (see Household Archaeology). Both household and settlement pattern approaches have fostered a growing interest in the nonelite sector of complex societies, and so have spurred the effort to understand societies as more than just undifferentiated, normative wholes.
At the intermediate scale of single sites or communities, settlement pattern approaches have compared the distribution of architectural and artifactual evidence across individual sites. Such investigations have clearly demonstrated significant intrasettlement variation in the functional use of space (e.g., Hill 1970), as well as distinctions in socioeconomic status and occupational history (e.g., Blanton 1978). From a comparative perspective, detailed settlement pattern maps and plans of specific sites have provided key insights into the similarities and differences between contemporaneous cities and communities in specific regions, as well as the elucidation of important patterns of cross-cultural diversity.

2. Contemporary Research Strategies and Ongoing Debates

The expansion of settlement pattern and landscape approaches over the last decades has promoted the increasing acceptance of less normative perspectives on cultural change and diversity across the discipline of archaeology. In many global domains, archaeological surveys have provided a new regional-scale (and in a few cases, macroregional-scale) vantage on past social systems. Settlement pattern studies also have yielded a preliminary means for estimating the parameters of diachronic demographic change and distribution at the scale of populations, something almost impossible to obtain from excavations alone. Nevertheless, important discussions continue over the environmental constraints on implementation, the relative strengths and weaknesses of different survey methodologies, issues of chronological control, procedures for population estimation, and the appropriate means for the interpretation of settlement pattern data.

2.1 Environmental Constraints

Although systematic settlement pattern and landscape studies have been undertaken in diverse environmental settings, including heavily vegetated locales such as the Guatemalan Petén, the eastern woodlands of North America, and temperate Europe, the most sustained and broadly implemented regional survey programs to date have been enacted in arid environments. In large part, this preference pertains to the relative ease of finding the artifactual and architectural residues of ancient sites on the surface of landscapes that lack thick vegetal covers. Nevertheless, archaeologists have devised a variety of means, such as the interpretation of satellite images, the detailed analysis of aerial photographs, and subsurface testing programs, that can be employed to locate and map past settlements in locales where they are difficult to find through pedestrian coverage alone. In each study area, regional surveys also have to adjust their specific field methodologies (the intensity of the planned coverage) and the sizes of the areas that they endeavor to examine to the nature of the terrain and the density of artifactual debris (generally nonperishable ancient refuse) associated with the sites in the specified region. For example, sedentary pottery-using peoples generally created more garbage than did mobile foragers; the latter usually employed more perishable containers (e.g., baskets, cloth bags). Consequently, other things being equal, the sites of foragers are generally less accessible through settlement and landscape approaches than are the ancient settlements that were inhabited for longer durations (especially when ceramics were used).

2.2 Survey Methodologies and Sampling

Practically since the inception of settlement pattern research, archaeologists have employed a range of different field survey methods. A critical distinction has been drawn between full-coverage and sample surveys. The former approaches rely on the complete and systematic coverage of the study region by members of a survey team. In order to ensure the full coverage of large survey blocks, team members often space themselves 25–50 m apart, depending on the specific ground cover, the terrain, and the density of archaeological materials. As a consequence, isolated artifact finds can occasionally be missed, but the researchers generally can discern a reasonably complete picture of settlement pattern change across a given region. Alternatively, sample surveys by definition are restricted to the investigation of only a part of (a sample of) the study region. Frequently such studies (because they only cover sections of larger regions) allow for the closer spacing of crew members. Archaeologists have employed a range of different sampling designs. Samples chosen for investigation may be selected randomly or stratified by a range of diverse factors, including environmental variables. Nevertheless, regardless of the specific sampling designs employed, such sample surveys face the problem of extrapolating the results from their surveyed samples to the larger target domains that are the ultimate focus of study. Ultimately, such sample surveys have been shown to be more successful at estimating the total number of sites in a given study region than at defining the spacing between sites or at discovering rare types of settlement. The appropriateness of a sample design can only be decided by the kinds of information that the investigator aims to recover. There is no single correct way to conduct archaeological survey, but certain methodological procedures have proven more productive in specific contexts and given particular research aims.
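The asymmetry between what sample surveys estimate well (total site counts) and poorly (rare settlement types) is easy to demonstrate numerically. The following minimal Python sketch uses entirely invented figures, not data from any actual survey: a region of 100 quadrats holds many common sites and a handful of rare ones, and repeated 20 percent random quadrat samples are drawn. The extrapolated total site count is close to the truth on average, while the rare settlement type goes completely undetected in a substantial share of trials.

    import random

    random.seed(1)
    GRID = 100                            # region divided into 100 survey quadrats
    COMMON_SITES, RARE_SITES = 400, 5     # hypothetical site counts
    SAMPLE_FRACTION = 0.2

    # scatter sites among quadrats at random
    common = [random.randrange(GRID) for _ in range(COMMON_SITES)]
    rare = [random.randrange(GRID) for _ in range(RARE_SITES)]

    estimates, rare_found = [], 0
    for trial in range(1000):
        sampled = set(random.sample(range(GRID), int(GRID * SAMPLE_FRACTION)))
        n_seen = sum(q in sampled for q in common + rare)
        estimates.append(n_seen / SAMPLE_FRACTION)   # extrapolate to whole region
        rare_found += any(q in sampled for q in rare)

    print("true total:", COMMON_SITES + RARE_SITES)
    print("mean estimated total:", sum(estimates) / len(estimates))
    print("trials detecting any rare site: %d of 1000" % rare_found)

With these placeholder numbers the mean estimated total lands near the true 405 sites, but roughly a third of the trials never encounter a single rare site, which mirrors the limitation described above.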
2.3 Chronological Constraints and Considerations
One of the principal strengths of settlement pattern research is that it provides a broad-scale perspective on the changing distribution of human occupation across landscapes. Yet the precision of such temporal sequences depends on the quality of chronological control (see Chronology, Stratigraphy, and Dating Methods in Archaeology). The dating of sites during surveys must depend on the recovery and temporal placement of chronologically diagnostic artifacts from the surface of such occupations. Artifacts found on the surface usually are already removed from their depositional contexts. Finer chronometric dating methods generally are of little direct utility for settlement pattern research, since such methods are premised on the recovery of materials in their depositional contexts. Of course, chronometric techniques can be used in more indirect fashion to refine the relative chronological sequences that are derived from the temporal ordering of diagnostic artifacts (typically pottery). In many regions, the chronological sequences can only be refined to periods of several hundred years in length. As a result, sites of shorter occupational duration that are judged to be contemporaneous could in fact have been inhabited sequentially. In the same vein, the size of certain occupations may be overestimated as episodes of habitation are conflated. Although every effort should be made to minimize such analytical errors, these problems in themselves do not negate the general importance of the long-term regional perspective on occupational histories that in many areas of the world can be derived from archaeological survey alone. Although the broad-brush perspective from surveys may never provide the precision or detailed views that are possible from excavation, it yields an encompassing representation at the population scale that excavations cannot achieve. Adequate holistic perspectives on past societies rely on the multiscalar vantages that are provided through the integration of wide-ranging archaeological surveys with targeted excavations.
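How strongly coarse ceramic phases inflate apparent contemporaneity can be gauged with a back-of-the-envelope simulation. In this Python sketch all figures are hypothetical: hamlets occupied for about 50 years are scattered at random through a 300-year phase, and every one of them is assigned to that phase by its surface diagnostics, so the phase-level site count far exceeds the number of settlements actually standing at any one moment.

    import random

    random.seed(7)
    PHASE_LENGTH = 300   # years in one ceramic phase (hypothetical)
    OCCUPATION = 50      # typical occupation span of a hamlet (hypothetical)
    N_SITES = 60         # sites assigned to the phase by surface diagnostics

    # each site is occupied for a random 50-year window within the phase
    starts = [random.uniform(0, PHASE_LENGTH - OCCUPATION) for _ in range(N_SITES)]

    # count how many sites coexist at each year of the phase
    coexisting = [
        sum(s <= year <= s + OCCUPATION for s in starts)
        for year in range(PHASE_LENGTH)
    ]

    print("sites assigned to the phase:", N_SITES)
    print("maximum truly contemporaneous:", max(coexisting))
    print("mean truly contemporaneous: %.1f" % (sum(coexisting) / len(coexisting)))

Under these assumptions the mean number of truly coexisting settlements is roughly N_SITES × OCCUPATION / PHASE_LENGTH, here about 10 of the 60 mapped sites, which illustrates why phase length relative to occupation span matters so much for demographic inference.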
2.4 Population Estimation
One of the key changes in archaeological thought and conceptualization over the past half-century has been the shift from essentialist/normative thinking about ancient societies to a more populational perspective. But the issue of how to define past populations, their constituent parts, and the changing modes of interaction between those parts remains challenging at best. Clearly, multiscalar perspectives on past social systems are necessary to collect the basic data required to estimate areal shifts in population size and distribution. Yet considerable debate has been engendered over the means employed by archaeologists to extrapolate from the density and dispersal of surface artifacts pertaining to a specific phase to the estimated sizes of past communities or populations. Generally, archaeologists have relied on some combination of the empirically derived size of a past settlement, along with a comparative determination of surface artifact densities at that settlement, to generate demographic estimates for a given community. When the estimates are completed for each settlement across an entire survey region, extrapolations become possible for larger study domains. By necessity, the specific equations to estimate past populations vary from one region to another because community densities are far from uniform over time or space. Yet due to chronological limitations, as well as the processes of deposition, disturbance, and destruction, our techniques for measuring ancient populations remain coarse-grained. Although much refinement is still needed to translate survey data into quantitative estimates of population with a degree of precision and accuracy, systematic regional surveys still can provide the basic patterns of long-term demographic change over time and space that cannot be ascertained in any other way.
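A minimal illustration of the kind of equation involved follows. The persons-per-hectare coefficients, site names, and areas in this Python sketch are hypothetical placeholders (actual coefficients are calibrated region by region), but the logic, site area multiplied by a density-dependent coefficient and then summed across the survey region, matches the procedure described above.

    # hypothetical density classes: persons per hectare, keyed by
    # the relative density of surface pottery at a site
    DENSITY_COEFF = {"light": 5, "moderate": 25, "heavy": 50}

    # (site name, occupied area in hectares, surface-artifact density class)
    sites = [
        ("Site A", 12.0, "heavy"),
        ("Site B", 3.5, "moderate"),
        ("Site C", 8.0, "light"),
    ]

    def estimate_population(area_ha, density_class):
        """Area times a density-dependent coefficient: the usual first-order model."""
        return area_ha * DENSITY_COEFF[density_class]

    regional_total = 0
    for name, area, density in sites:
        pop = estimate_population(area, density)
        regional_total += pop
        print("%s: ~%d persons" % (name, pop))
    print("regional estimate: ~%d persons" % regional_total)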
2.5 The Interpretation of Regional Data
Beyond the broad-brush assessment of demographic trends and site distribution in relation to environmental considerations, archaeologists have interpreted and analyzed regional sets of data in a variety of ways. Landscape approaches, which began with a focused perspective on humans and their surrounding environment, have continued in that vein, often at smaller scales. Such studies often examine in detail the placement of sites in a specific setting with an eye toward landscape conservation and the meanings behind site placement (Sherratt 1996). At the same time, some landscape studies have emphasized the identification of ancient agrarian features and their construction and use. In contrast, the different settlement pattern investigations have employed a range of analytical and interpretive strategies. In general, these have applied more quantitative procedures and asked more comparatively informed questions. Over the last 40 years (e.g., Johnson 1977), a suite of locational models derived from outside the discipline has served as a guide against which different sets of archaeological data could be measured and compared. Yet debates have arisen over the underlying assumptions of such models and whether they are appropriate for understanding the preindustrial past. For that reason, even when comparatively close fits were achieved between heuristically derived expectations and empirical findings, questions regarding equifinality (similar outcomes due to different processes) emerged. More recently, theory-building efforts have endeavored to rework and expand these locational models to specifically archaeological contexts, with a modicum of success. Continued work in this vein, along with the integration of some of the conceptual strengths from both landscape and settlement pattern approaches, is requisite to understanding the complex web of relations that govern human-to-human and human-to-environment interactions across diverse regions over long expanses of time.
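One commonly borrowed locational benchmark of the kind discussed in work such as Johnson (1977) is the rank-size rule, under which the nth-largest settlement in an integrated system is expected to be about 1/n the size of the largest. The short Python sketch below uses invented settlement sizes purely for illustration; it computes how far each observed size deviates from that Zipfian expectation, the sort of comparison against which regional data have been measured.

    # Rank-size comparison: under the classic rank-size (Zipf) expectation,
    # the nth-largest settlement is about 1/n the size of the largest.
    # Settlement sizes (in hectares) are invented for illustration.
    observed = sorted([120.0, 45.0, 38.0, 20.0, 15.0, 9.0], reverse=True)

    largest = observed[0]
    for rank, size in enumerate(observed, start=1):
        expected = largest / rank
        deviation = (size - expected) / expected * 100
        print("rank %d: observed %.1f ha, expected %.1f ha (%+.0f%%)" %
              (rank, size, expected, deviation))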
3. Looking Forward: The Critical Role of Settlement Studies

The key attribute of archaeology is its long temporal panorama on human social formations. Understanding these formations and how they changed, diversified, and varied requires a regional/populational perspective (as well as other vantages at other scales). Over the last century, the methodological and interpretive toolkits necessary to obtain this broad-scale view have emerged, diverged, and thrived. The emergence of archaeological survey (and settlement pattern and landscape approaches) has been central to the disciplinary growth of archaeology and its increasing ability to address and to contribute to questions of long-term societal change. At the same time, the advent of settlement pattern studies has had a critical role in moving the discipline as a whole from normative to populational frameworks. Yet settlement pattern work has only recently entered the popular notion of this discipline, long wrongly equated with and defined by excavation alone. Likewise, many archaeologists find it difficult to come to grips with a regional perspective that has its strength in (broad) representation at the expense of specific reconstructed detail. Finally, archaeologists have only scratched the surface of the potential for theoretical contributions and insights from settlement pattern and landscape approaches (and the wealth of data collected by such studies). In many respects, the growth of regional survey and analysis represents one of the most important conceptual developments of twentieth-century archaeology. Yet at the same time, there are still many mountains (literally and figuratively) to climb.

See also: Chronology, Stratigraphy, and Dating Methods in Archaeology; Household Archaeology; Survey and Excavation (Field Methods) in Archaeology
Bibliography

Adams R M 1965 Land Behind Baghdad: A History of Settlement on the Diyala Plains. University of Chicago Press, Chicago
Adams R M 1981 Heartland of Cities: Surveys of Ancient Settlement and Land Use on the Central Floodplain of the Euphrates. University of Chicago Press, Chicago
Ammerman A J 1981 Surveys and archaeological research. Annual Review of Anthropology 10: 63–88
Billman B R, Feinman G M (eds.) 1999 Settlement Pattern Studies in the Americas: Fifty Years Since Virú. Smithsonian Institution Press, Washington, DC
Blanton R E 1978 Monte Albán: Settlement Patterns at the Ancient Zapotec Capital. Academic Press, New York
Blanton R E, Kowalewski S A, Feinman G M, Finsten L M 1993 Ancient Mesoamerica: A Comparison of Change in Three Regions, 2nd edn. Cambridge University Press, Cambridge, UK
Fish S K, Kowalewski S A (eds.) 1990 The Archaeology of Regions: A Case for Full-coverage Survey. Smithsonian Institution Press, Washington, DC
Flannery K V (ed.) 1986 Guilá Naquitz: Archaic Foraging and Early Agriculture in Oaxaca, Mexico. Academic Press, Orlando, FL
Flannery K V, Winter M C 1976 Analyzing household activities. In: Flannery K V (ed.) The Early Mesoamerican Village. Academic Press, New York, pp. 34–47
Fox C 1923 The Archaeology of the Cambridge Region. Cambridge University Press, Cambridge, UK
Hill J N 1970 Broken K Pueblo: Prehistoric Social Organization in the American Southwest. University of Arizona Press, Tucson, AZ
Johnson G A 1977 Aspects of regional analysis in archaeology. Annual Review of Anthropology 6: 479–508
McDonald W A, Rapp G R Jr (eds.) 1972 The Minnesota Messenia Expedition: Reconstructing a Bronze Age Environment. University of Minnesota Press, Minneapolis, MN
Morgan L H 1881 Houses and House Life of the American Aborigines. US Department of Interior, Washington, DC
Parsons J R 1972 Archaeological settlement patterns. Annual Review of Anthropology 1: 127–50
Parsons J R, Hastings C M, Matos R 1997 Rebuilding the state in highland Peru: Herder-cultivator interaction during the Late Intermediate period in the Tarama-Chinchaycocha region. Latin American Antiquity 8: 317–41
Phillips P, Ford J A, Griffin J B 1951 Archaeological Survey in the Lower Mississippi Alluvial Valley, 1941–47. Peabody Museum of Archaeology and Ethnology, Cambridge, MA
Renfrew C 1972 The Emergence of Civilisation: The Cyclades and the Aegean in the Third Millennium BC. Methuen, London
Sanders W T, Parsons J R, Santley R S 1979 The Basin of Mexico: Ecological Processes in the Evolution of a Civilization. Academic Press, New York
Sherratt A 1996 'Settlement patterns' or 'landscape studies'? Reconciling reason and romance. Archaeological Dialogues 3: 140–59
Steward J H 1938 Basin Plateau Aboriginal Sociopolitical Groups. Bureau of American Ethnology, Washington, DC
Wilk R R, Rathje W L (eds.) 1982 Archaeology of the Household: Building a Prehistory of Domestic Life. American Behavioral Scientist 25(6)
Willey G R 1953 Prehistoric Settlement Patterns in the Virú Valley, Peru. Bureau of American Ethnology, Washington, DC

G. M. Feinman
Sex Differences in Pay

Differences in pay between men and women remain pervasive in late twentieth-century labor markets, with women's average earnings consistently below men's even when differences in hours of work are taken into account. This 'gender pay gap' may result from unequal pay within jobs, but is also related to the different types of jobs occupied by men and women. Considerable debate has arisen over the extent to which it is evidence of discrimination in labor markets or simply a result of the individual attributes and choices of men and women. The size of the pay gap may also be influenced by the institutional framework of pay bargaining and regulation in different countries. The article reviews a range of factors thought to contribute to the gender pay gap, and the strategies designed to eradicate it. It draws primarily on evidence from affluent industrialized nations.
1. The Gender Pay Gap: Trends and Comparisons

While the gap between men's and women's average earnings has narrowed in most countries in the late twentieth century, significant inequality remains. Moreover, substantial cross-national differences are apparent. Table 1 illustrates some of these variations in selected countries. To enhance comparison, the data are drawn as far as possible from one source (International Labour Office [ILO] Yearbook of Labour Statistics). Where available, the figures are for hourly earnings, as data for longer periods produce deflated estimates of women's relative earnings due to the tendency for women to work fewer hours than men. Also, while the most widely available and reliable data are for manufacturing, women's employment in industrialized nations tends to be concentrated more in service industries. Thus earnings in the ILO's more inclusive category of 'non-agricultural activity' are also reported where available, although these are less satisfactory for cross-national comparison due to variations in the range of industries included. Even within countries, series breaks in data collection may limit the accuracy of comparisons over time. Some caution is therefore needed in interpreting cross-national differences and trends. Nevertheless, a number of broad observations can be made on the basis of the data presented in Table 1. Looking first at trends within countries, an increase in women's average earnings relative to men's since 1970 is evident in all of the countries listed except Japan. In many cases this has occurred mainly in the 1970s, although some countries (for example, Belgium, Canada, and the USA) show marked improvement since 1980. In several cases, however, closing of the gender pay gap has slowed in the 1990s. The table also provides some indication of cross-national variation in the size of the gender pay gap. Figures for Canada and the USA are from different sources and not strictly comparable with data from the other countries. Among the others listed, the picture of cross-national variance appears broadly similar for manufacturing and 'non-agricultural' data. On the basis of the hourly manufacturing data, which are the most reliable for comparative purposes, Sweden stands out as having the narrowest gender pay gap in the mid-1990s. The large gap in Japan reflects the use of monthly rather than hourly earnings, and thus the impact of different working hours between men and women.
Table 1. Women's earnings as a percentage of men's (i), selected countries and years (ii)

                       Manufacturing                  Non-agricultural industries (iii)
                 1970     1980   1990   1995          1970      1980   1990   1995
Australia         64       79     83     85            65        86     88     90
Belgium           68       70     75     79            67        69     75     79
Canada (iv)       —        —      —      —             60        64     68     73
Denmark           74       86     85     85(ix)        72        85     83     —
France            77(vii)  77     79     79(x)         —         79     81     81(x)
Japan (v)         45       44     41     44(ix)        51        54     50     —
Netherlands       72       75     75     75            74        78     78     76
New Zealand       66(viii) 71     74     77            72(viii)  77     81     81
Sweden            80       90     89     90            —         —      —      —
UK                58       69     68     71            60        70     76     79
USA (vi)          —        —      —      —             —         65     78     81

Sources: International Labour Office (ILO) Yearbook of Labour Statistics, ILO, Geneva, various issues; Statistics Canada, Earnings of Men and Women, Cat No 13-217 Annual, Statistics Canada, Ottawa, various issues; United States Department of Labor, Women's Bureau, Women's Earnings as a Percent of Men's, http://www.dol.gov/dol/wb/public/wbIpubs/7996.htm (accessed October 10, 1999).
i. Percentages are based on hourly earnings except for Canada and Japan. ii. Countries have been selected on the basis of data availability and to provide a span across continents. iii. ILO 'non-agricultural' activity groups. Data for Canada and the USA are from different sources and are based on all industries. iv. For Canada, percentages are based on yearly earnings for full-year, full-time workers. The use of yearly wages deflates the figures as women work fewer hours than men on average, even when the comparison is limited to full-time workers. v. For Japan, percentages are based on monthly earnings. vi. For the USA, percentages are based on hourly earnings, but these are for workers paid an hourly wage and are not directly comparable with hourly data for other countries. vii. 1972. viii. 1974. ix. 1992. x. 1993. — Data unavailable.
However, OECD approximations of hourly earnings in manufacturing in Japan still show a very large pay gap (OECD 1988, p. 212), and ILO data indicate that on this issue Japan looks more like other East Asian countries (for example, South Korea and Singapore) than the other countries listed in the table. Overall, the data raise several questions. Why is there a gender pay gap? What accounts for its variation over time and across nations? What can, or should, be done to eradicate it? These questions are addressed in the following sections, which examine possible explanations for the gender pay gap, the effect of institutional factors on its variation across countries, and the strategies most likely to assist in its elimination.
2. Explaining the Gender Pay Gap

One of the most direct ways in which sex differences in pay have arisen historically is through the establishment of different rates of pay for men and women in wage-setting processes. The assumption that men, as 'breadwinners,' should be entitled to higher pay than women underpinned pay determination in many countries until well into the twentieth century. In some countries this was institutionalized in the form of a 'family wage' for men. While early challenges to these assumptions were made in several countries, equal pay rates for men and women were usually only achieved where men's jobs were threatened by cheaper female labor (see, for example, Ryan and Conlon 1989, pp. 99–100). In most countries it was not until the 1960s and 1970s, following the growth of second wave feminism and increasing involvement of women in paid employment, that the inequity of different rates of pay for men and women was responded to directly through policies of equal pay for equal work. However, much of the pay difference between men and women results not from different pay rates in the same jobs, but from the location of men and women in different jobs. This sex segregation of the labor market has proved remarkably resistant to change, and in Walby's (1988) view accounts for the persistence of the gender pay gap since World War II in spite of the increasing human capital (i.e., education and labor market experience) of women over that time period (see Sex Segregation at Work). Segregation is significant for pay inequality to the extent that female-dominated sectors of the labor market deliver lower pay. Female-dominated occupations, for example, may be low-skilled or involve skills that have been undervalued. They may also be less likely to provide discretionary payments such as overtime or bonuses. Empirical studies demonstrate that the proportion of women in an occupational group is negatively associated with wage levels, with typically around one-third of the gender pay gap shown to be due to occupational segregation by sex (Treiman and Hartmann 1981). Results for this type
of analysis are highly dependent on the level of disaggregation of occupations applied. Finer disaggregation uncovers greater inequality, as vertical segregation exists within broad occupational groups, with men more likely to be in the higher status, higher paid jobs. Concentration of women in a small number of occupational groups is also significant for pay outcomes. Grimshaw and Rubery (1997), for example, demonstrate that in seven OECD countries around 60 percent of women are concentrated in just 10 occupational groups out of a total of between 50 and 80, with little change in this situation since the mid-1980s. Moreover, their analysis shows a wage penalty associated with this concentration of employment. Women within these occupations, on average, earn less than the all-occupation average. Another form of labor market division potentially affecting pay differences is the distribution of men and women between firms. Blau (1977) has shown that in the USA women and men are differently distributed among firms, with women more likely to be employed in comparatively low-paying firms. These sorts of divisions may contribute to sex differences in earnings within occupations. A further type of division that has become of increasing significance is that between full-time permanent and other, less regular, types of employment. Part-time work is highly female dominated in most countries, and on average tends to be less well remunerated per hour than full-time work. Waldfogel (1997, p. 215), for example, shows that part-time work carries a wage penalty for the women in her sample. Thus, although part-time positions may assist women to retain careers by facilitating the combination of work and family responsibilities, the concentration of women in this type of work may contribute to the gender pay gap. In sum, sex differences in pay can arise from different pay rates in particular jobs or the distribution of female employment into lower paying jobs. In both cases, pay differences may be due to forms of discrimination or to non-discriminatory influences. Non-discriminatory influences include the individual attributes and choices of men and women, and although these may reflect broader social inequities, they can be distinguished from overt forms of discrimination within the labor market. Both areas are considered below.
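The concentration statistic reported by Grimshaw and Rubery is straightforward to compute when occupation-level employment and wage data are available. The Python sketch below uses a handful of invented occupational records (real analyses use 50 to 80 groups) to show both calculations: the share of all female employment falling in the most female-populated occupations, and the average wage inside that concentrated set relative to the all-occupation average.

    # (occupation, women employed, mean female hourly wage) -- invented figures
    occupations = [
        ("clerical", 900, 11.0),
        ("retail sales", 700, 9.5),
        ("nursing", 600, 14.0),
        ("teaching", 500, 15.0),
        ("cleaning", 400, 8.0),
        ("management", 500, 20.0),
        ("engineering", 400, 18.0),
        ("finance", 300, 17.0),
        ("transport", 300, 13.0),
    ]

    TOP_N = 5  # stand-in for the 'top 10' groups used in cross-national studies

    by_size = sorted(occupations, key=lambda o: o[1], reverse=True)
    top = by_size[:TOP_N]

    total_women = sum(o[1] for o in occupations)
    share_top = sum(o[1] for o in top) / total_women

    def employment_weighted_wage(groups):
        # average wage weighted by the number of women in each group
        return sum(n * w for _, n, w in groups) / sum(n for _, n, w in groups)

    all_avg = employment_weighted_wage(occupations)
    top_avg = employment_weighted_wage(top)

    print("share of women in top %d occupations: %.0f%%" % (TOP_N, share_top * 100))
    print("their average wage: %.2f vs all-occupation average %.2f" % (top_avg, all_avg))

With these placeholder numbers, about two-thirds of women fall in the five largest groups and their employment-weighted wage sits below the all-occupation average, reproducing in miniature the wage penalty described above.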
2.1 Individual Attributes and Choices

A wide body of literature has addressed the extent to which the gender pay gap can be explained by individual attributes and choices. Human capital theory suggests that productivity-related differences between men and women, such as education, skill, and labor market experience, may account for sex differences in earnings. Typically, analyses decompose earnings differences between men and women into a component that can be 'explained' by productivity factors, and another that represents the 'unexplained' elements of the gender pay gap that are ascribed to labor market discrimination (Oaxaca 1973). Whether human capital differences themselves might be evidence of social inequality (for example, reflecting differential access to training or restricted choices about labor force attachment) has not been an explicit part of this analytical approach. Numerous studies have been conducted for different countries, and although there are difficulties of measurement and interpretation that complicate such analyses, human capital variables typically account for less than half, and often considerably smaller proportions, of the gender pay gap (Treiman and Hartmann 1981, p. 42). The studies which explain most have been those including detailed estimates of employment experience or labor force attachment (for example, Corcoran and Duncan 1979). Where women's more intermittent labor force attachment has been captured effectively, as in longitudinal surveys or work histories, the negative effect on wages has been clearly demonstrated (Waldfogel 1997, pp. 210–11). Alongside intermittent labor force attachment, several studies show a wage penalty for women associated with the presence of children. In a regression model utilizing data from 1968–88, Waldfogel (1997, p. 212) identifies a penalty in hourly wages associated with having a child, and a larger penalty for two or more children, even after controlling for actual employment experience and factors such as education. The effects of family and domestic labor responsibilities are thus likely to be cumulative, lowering women's earnings through reducing employment experience and the capacity to retain career paths. While some may interpret such findings as evidence that a proportion of the gender pay gap is non-discriminatory and simply due to individual choices, others may observe that women's disproportionate responsibility for family care affects the range of choices available (see Motherhood: Economic Aspects).
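The decomposition referred to above (Oaxaca 1973) has a compact form: the gap in mean log wages splits into a part attributable to different average characteristics (evaluated, in one common convention, at the male coefficients) and an 'unexplained' part attributable to different estimated returns to those characteristics. The following Python sketch runs the decomposition on synthetic data; the wage equation, coefficients, and experience distributions are all invented for illustration, not estimates from any real survey.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000

    def simulate(years_mean, beta_exp, intercept):
        """Synthetic workers: log wage = intercept + beta_exp * experience + noise."""
        exp_years = rng.normal(years_mean, 4, n).clip(0)
        log_wage = intercept + beta_exp * exp_years + rng.normal(0, 0.3, n)
        X = np.column_stack([np.ones(n), exp_years])
        return X, log_wage

    # men given more average experience and (by construction) higher returns to it
    X_m, y_m = simulate(years_mean=15, beta_exp=0.030, intercept=2.0)
    X_f, y_f = simulate(years_mean=10, beta_exp=0.020, intercept=2.0)

    beta_m = np.linalg.lstsq(X_m, y_m, rcond=None)[0]
    beta_f = np.linalg.lstsq(X_f, y_f, rcond=None)[0]

    gap = y_m.mean() - y_f.mean()
    explained = (X_m.mean(axis=0) - X_f.mean(axis=0)) @ beta_m   # characteristics
    unexplained = X_f.mean(axis=0) @ (beta_m - beta_f)           # returns

    print("total log-wage gap: %.3f" % gap)
    print("explained part:     %.3f" % explained)
    print("unexplained part:   %.3f" % unexplained)

Because each regression includes an intercept, the two components sum exactly to the total gap; the choice of reference coefficients (male here) is itself a judgment call that affects how the gap is apportioned.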
2.2 Discrimination in Employment

There are several types of labor market discrimination with implications for sex differences in pay. For example, employers may discriminate against women when recruiting and promoting staff, preferring to hire men for higher status positions, and investing more in training and career support for men. This may be 'statistical discrimination,' reflecting assumptions that women will be more likely than men to leave work, or have lower levels of commitment to it, once they have family responsibilities. However, it is not clear that women leave jobs more frequently than men (see England 1992, p. 33), hence the rationality of this type of discrimination is questionable. Less direct forms of discrimination may be the result of customary practice, with long-standing procedures and organizational cultures effectively hindering the advancement of women (see Cockburn 1991). Discrimination may also be evident in the way pay rates are established. While the overt use of different rates of pay for men and women discussed earlier is no longer widespread, female-dominated areas of employment may be relatively underpaid. England's (1992) analysis provides some evidence for this by showing that the sex composition of an occupation explains between 5 percent and 11 percent of the gender pay gap even when factors such as different types of demands for skill and effort, and industrial and organizational characteristics, are controlled for. Undervaluation could be the result of women's comparatively low bargaining power, and could also reflect gender-biased estimations of the value of skills in female-dominated occupations. England (1992), for example, shows that 'nurturing' skills carry a wage penalty, thus suggesting that this type of work is devalued because of its association with typical 'women's work.' Such findings indicate that policies of 'comparable worth' or 'equal pay for work of equal value'—that is, comparisons of dissimilar jobs to produce estimations of job value free of gender bias—have an important role to play.
3. The Role of Institutions

While the factors considered thus far contribute to an understanding of the gender pay gap in a general sense, they are frequently less useful in explaining differences between countries. For example, differences between countries in the level of occupational segregation appear to be unrelated to performance on sex differences in pay. In Japan, the combination of low levels of occupational segregation and a large gender pay gap may be explained partly by women's relative lack of access to seniority track positions in large firms (Anker 1997, p. 335)—that is, by other types of segregation. However, a different type of explanation is necessary to understand why some countries that are highly sex-segregated by occupation (such as Sweden and Australia) have comparatively narrow gender pay gaps. This anomaly suggests the importance of institutional factors in influencing the gender pay gap. Blau and Kahn (1992), for example, point out that the effect of occupational concentration on sex differences in pay will be influenced by the overall wage distribution in any country—where wages are relatively compressed, the effect of segregation on earnings will be minimized. Wage distribution is in turn likely to be a product of the institutional framework for wage bargaining, with more centralized and regulated systems conducive to lower levels of wage dispersion, and, therefore, a narrower gender pay gap. Cross-national statistical evidence supports these links, showing an association between centralized wage fixation and high levels of pay equity (Whitehouse 1992, Gunderson 1994). This relationship is likely to result not only from wage compression, but also from an enhanced capacity to implement equal pay measures in more centralized systems. Rubery (1994) notes that the degree of centralization has implications for several matters of relevance to pay equity outcomes, including the maintenance of minimum standards and the scope for equal value comparisons. Decentralized pay systems tend to provide more limited scope for comparisons to support equal pay for work of equal value cases, and results may be limited to specific enterprises, or to individuals. Overall, trends in wage bargaining arrangements may be more influential on pay equity outcomes than specific gender equity measures (Rubery 1994, Rubery et al. 1997, Whitehouse 1992). Wage bargaining systems are, however, country-specific, and embedded in national employment systems (Rubery et al. 1997). Translation of institutional structures across countries is therefore unlikely to be a viable proposition.
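Blau and Kahn's point about wage distributions can be illustrated with a toy calculation in Python (all wages and shares are invented): hold segregation fixed, with women concentrated in the lower-paid of two occupations, and pull both occupational wages toward the grand mean. The measured gender gap narrows even though nobody changes jobs.

    # Two occupations with fixed sex segregation; wages invented for illustration.
    LOW_WAGE, HIGH_WAGE = 10.0, 20.0
    WOMEN = {"low": 0.8, "high": 0.2}   # 80% of women in the low-paid occupation
    MEN = {"low": 0.2, "high": 0.8}

    def mean_wage(shares, low, high):
        return shares["low"] * low + shares["high"] * high

    def gap_under_compression(factor):
        """Pull both occupational wages toward the grand mean by `factor` (0..1)."""
        grand = (LOW_WAGE + HIGH_WAGE) / 2
        low = grand + (LOW_WAGE - grand) * (1 - factor)
        high = grand + (HIGH_WAGE - grand) * (1 - factor)
        women = mean_wage(WOMEN, low, high)
        men = mean_wage(MEN, low, high)
        return (1 - women / men) * 100   # gap as a percentage of the male wage

    for factor in (0.0, 0.25, 0.5, 0.75):
        print("compression %.0f%%: gender gap %.1f%%"
              % (factor * 100, gap_under_compression(factor)))

With no compression the gap is about 33 percent; at 50 percent compression it falls to roughly 18 percent, despite identical occupational segregation throughout.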
4. Policy Options

As the gender pay gap cannot be attributed to one type of cause, a number of strategies will be necessary to attempt its elimination. While addressing pre-market impediments to women's advancement (such as sex-role stereotypes and their impact on educational and employment choices) is part of this agenda, the main strategies are those designed to remove barriers within the labor market. International conventions such as the ILO's Equal Remuneration Convention (No. 100) and the United Nations' Convention on the Elimination of All Forms of Discrimination Against Women have provided some impetus for action, and most countries have now implemented some form of equal pay legislation or prohibition against discrimination in employment. However, the efficacy of such measures is highly variable. The most direct strategies are legislative provisions requiring the payment of equal pay for equal work, and equal pay for work of equal value. Equal pay for equal work provisions have been most effective where they have included a requirement to remove differential pay rates for men and women in industrial agreements. The rapid narrowing of the gender pay gap in Australia and the UK in the 1970s, for example, demonstrates the effectiveness of measures that removed direct discrimination in collective agreements (Gregory et al. 1989, Zabalza and Tzannatos 1985). Improvements in Sweden in the same decade also reflect the advantage of widespread coverage by collective agreements, and in that case predate the introduction of equal pay legislation. However, while
equal pay requirements may be quite effective initially where they equalize minimum rates between men and women across a comprehensive set of collective agreements, the level of segregation in the labor market means that most provisions for equal pay for equal work apply to only a small proportion of women, as few men and women are in fact doing the same work. Provisions for equal pay for work of equal value (comparable worth) open up a wider field for contestation, but this strategy has proved difficult to implement. Historical bias in the way 'female' jobs are valued has not been easy to eradicate, as even quite detailed job evaluation methods retain aspects of gender bias and may in fact perpetuate existing hierarchies (Steinberg 1992). Moreover, cases have proved complex and time consuming. Comparable worth does permit reconsideration of the valuation of work, however, and is particularly important given the apparent resistance of patterns of occupational segregation to change. It will be most effective where the scope for comparison is wide, and results apply collectively to types of jobs rather than to individuals. Apart from strategies that deal directly with pay, provisions to prohibit discrimination in employment have been adopted in most countries. Anti-discrimination legislation and equal employment opportunity or affirmative action measures aim to prevent sex discrimination in hiring and placement, and in some cases seek to correct for past sex discrimination by requiring attention to the unequal distribution of men and women within organizations. While it is difficult to estimate the impact of such measures on the gender pay gap, they have no doubt restricted the scope for overt discrimination and contributed to a gradual change in customs and attitudes. Given the impact of family responsibilities on women's earnings noted earlier, erosion of the gender pay gap will also require the use of strategies to assist in the combination of work and family responsibilities. Ultimately this also requires a more even division of domestic labor between men and women to assist women to retain career progression and employment experience. Paid parental and family leave, and measures to deliver flexibility with job security while combining employment and caring responsibilities, will be part of this agenda, although experience thus far suggests that encouragement to fathers to share these types of provisions will also be necessary. Finally, it must be emphasized that the gender pay gap is dependent on a wide range of policy and institutional factors, most of which are not designed with gender equity goals in mind. In particular, wage bargaining arrangements and employment policies may affect the size of the gender pay gap by affecting outcomes such as wage dispersion. Trends away from centralized and regulated forms of pay bargaining are therefore of some concern as they may increase dispersion, which in turn may erode gains made
through equal pay or comparable worth strategies. In short, the pursuit of pay equity cannot be limited to a single agenda, but requires multiple policy measures, and—like most endeavors attempting significant social change—will require a long period of time to achieve.
Bibliography

Anker R 1997 Theories of occupational segregation by sex: An overview. International Labour Review 136: 315–39
Blau F 1977 Equal Pay in the Office. DC Heath and Company, Lexington, MA
Blau F, Kahn L 1992 The gender earnings gap: Learning from international comparisons. American Economic Review, Papers and Proceedings 82: 533–8
Cockburn C 1991 In the Way of Women: Men's Resistance to Sex Equality in Organizations. ILR Press, Ithaca, NY
Corcoran M, Duncan G 1979 Work history, labor force attachments, and earnings differences between the races and sexes. Journal of Human Resources 14: 3–20
England P 1992 Comparable Worth: Theories and Evidence. Aldine de Gruyter, New York
Gregory R G, Anstie R, Daly A, Ho V 1989 Women's pay in Australia, Great Britain and the United States: The role of laws, regulations, and human capital. In: Michael R T, Hartmann H I, O'Farrell B (eds.) Pay Equity: Empirical Inquiries. National Academy Press, Washington, DC, pp. 222–42
Grimshaw D, Rubery J 1997 The Concentration of Women's Employment and Relative Occupational Pay: A Statistical Framework for Comparative Analysis. Labour Market and Social Policy, Occasional Paper No 26, Organisation for Economic Co-operation and Development, Paris
Gunderson M 1994 Comparable Worth and Gender Discrimination: An International Perspective. International Labour Office, Geneva, Switzerland
Oaxaca R 1973 Male–female wage differentials in urban labor markets. International Economic Review 14: 693–709
OECD 1988 Employment Outlook. Organisation for Economic Co-operation and Development, Paris
Rubery J 1994 Decentralisation and individualisation: The implications for equal pay. Économies et Sociétés 18: 79–97
Rubery J, Bettio F, Fagan C, Maier F, Quack S, Villa P 1997 Payment structures and gender pay differentials: Some societal effects. The International Journal of Human Resource Management 8: 131–49
Ryan E, Conlon A 1989 Gentle Invaders: Australian Women at Work. Penguin, Ringwood, Victoria
Steinberg R J 1992 Gendered instructions: Cultural lag and gender bias in the Hay system of job evaluation. Work and Occupations 19: 387–423
Treiman D J, Hartmann H I (eds.) 1981 Women, Work and Wages: Equal Pay for Jobs of Equal Value. National Academy Press, Washington, DC
Walby S 1988 Introduction. In: Walby S (ed.) Gender Segregation at Work. Open University Press, Milton Keynes, UK
Waldfogel J 1997 The effect of children on women's wages. American Sociological Review 62: 209–17
Whitehouse G 1992 Legislation and labour market gender inequality: An analysis of OECD countries. Work, Employment & Society 6: 65–86
Zabalza A, Tzannatos Z 1985 Women and Equal Pay: The Effect of Legislation on Female Employment and Wages in Britain. Cambridge University Press, Cambridge, UK
G. Whitehouse
Sex Hormones and their Brain Receptors

Nerve cells react not only to electrical and chemical signals from other neurons, but also to a variety of steroid factors arising from outside the brain. Steroids, particularly gonadal steroids, have profound effects on brain development, sexual differentiation, central nervous control of puberty, the stress response, and many functions in the mature brain including cognition and memory (Brinton 1998, McEwen 1997). The brain contains receptors for all five classes of steroid hormones: estrogens, progestins, androgens, glucocorticoids, and mineralocorticoids. Steroid hormone receptors are not uniformly distributed, but occupy specific loci within the brain (McEwen 1999, Shughrue et al. 1997). As expected, the hypothalamus has a rich abundance of steroid receptors, particularly receptors for the sex hormones estrogen, progesterone, and testosterone. Remarkably, the limbic system is also a site for steroid action and contains estrogen, androgen, and glucocorticoid receptors. The cerebral cortex is also a site of steroid action and expresses receptors for estrogens and glucocorticoids. Some of these receptors arise from separate genes while others arise from alternative splicing of a single gene. For example, the progesterone receptor exists in multiple isoforms derived from the alternative splicing of one gene (Whitfield et al. 1999), whereas estrogen receptors have multiple receptor subtypes (ERα and ERβ) that originate from separate genes, as well as multiple isoforms (Warner et al. 1999, Keightley 1998).
1. Structure of Steroid Receptors

All genomic steroid hormone receptors are composed of at least three domains: the N-terminal (or A/B) region, the DNA binding domain containing two zinc fingers (region C), and the ligand-binding domain to which the hormone binds (region E). Select steroid receptors, such as the estrogen receptors, have an additional C-terminal (or F) region, which serves to modify receptor function (Tsai and O'Malley 1994, Warner et al. 1999) (Fig. 1A). The amino-terminal A/B region of steroid receptors is highly variable and contains a transactivation domain, which interacts with components of the core transcriptional complex. Region C of these receptors contains a core sequence of 66 amino acids that is the most highly conserved region of the nuclear hormone receptor family. This region folds into two structures that bind zinc in a way characteristic of type II zinc fingers. These two type II zinc clusters are believed to be involved in specific DNA binding, thus giving the C region its designation as the DNA binding domain. In addition, the C region plays a role in receptor dimerization (Fig. 1A and B). The D region contains a hinge portion that may be conformationally altered upon ligand binding. The E/F region is functionally complex since it contains regions important for ligand binding and receptor dimerization, nuclear localization, and interactions with transcriptional coactivators and corepressors (Horwitz et al. 1996). It has been observed that a truncated receptor lacking regions A/B and C can bind hormone with an affinity similar to that of the complete receptor, suggesting that region E is an independently folded structural domain. Amino acids within region E that are conserved amongst all members of the nuclear receptor family are thought to be responsible for the conformation of a hydrophobic pocket necessary for steroid binding, whereas the non-conserved amino acids may be important for ligand specificity. The conserved amino acid sequences within the ligand binding domain of the nuclear receptors make up a common structural motif, which is composed of 11–12 individual α helices, with helix 12 being absent in some members of the superfamily of steroid receptors. Although domain F is not required for transcriptional response to hormone, it appears to be important for modulating receptor function in response to different ligands. Of prime importance in estrogen receptor function is helix 12 (Fig. 1C), found in the most carboxyl region of the ligand-binding domain, which is thought to be responsible for transcriptional activation function (AF-2) activity (Warner et al. 1999). This region realigns over the ligand-binding pocket when associated with agonists, but takes on a different conformation with antagonists (Fig. 1). Such conformational changes are thought to affect interactions with coactivators, since helix 12 is required for interaction with several of the coactivator proteins, and mutations that are known to abolish AF-2 transcriptional activity also abolish the interaction of the nuclear receptors with several of their associated proteins (McKenna et al. 1999, Warner et al. 1999). The ability of different ligands to determine the conformational state of the receptor has profound consequences on the ability of the steroid receptor to drive gene transcription.

Figure 1 Structural aspects of steroid receptors. A. The steroid receptors can be divided into 6 functional domains: A, B, C, D, E, and F. The function of each domain is indicated by the solid lines. B. Ribbon representation of the ligand binding domain of estrogen receptor α indicating the ligand binding pocket. Upon ligand binding the receptor dimerizes and then can act as a transcription factor (adapted from Brzozowski et al. 1997). C. Ribbon representation of ERα demonstrating the conformational alterations induced by 17β-estradiol. Upon binding estradiol, helix 12 folds to cover the ligand binding pocket, leaving the coactivator binding site free. D. Ribbon representation of ERα demonstrating the conformational alterations induced by raloxifene. Upon binding raloxifene, helix 12 is displaced, obscuring the coactivator binding site (adapted from Brzozowski et al. 1997)
2. Classical Mechanism of Sex Steroid Action

Classically, sex steroids exert their effects by binding to intracellular receptors and modifying gene expression (Tsai and O'Malley 1994). Steroid receptors are ligand-activated transcription factors that modulate specific gene expression. In the presence of hormone, two receptor monomers dimerize and bind to short DNA sequences located in the vicinity of hormone-regulated genes (Fig. 2). These specific DNA sequences, or hormone response elements (HRE), contain palindromic or directly repeating half-sites (Tsai and O'Malley 1994). Since an HRE exerts its action irrespective of its orientation and when positioned at a variable distance upstream or downstream from a variety of promoters (Tsai and O'Malley 1994), it is an enhancer. Steroid receptors represent inducible enhancer factors that contain regions important for hormone binding, HRE recognition, and activation of transcription. The mechanism of transactivation by nuclear receptors has recently achieved further complexity through the discovery of an increasing number of coregulators. Coregulator proteins can modulate the efficacy and direction of steroid-induced gene transcription. Coregulators include the coactivators ERAP160 and 140, and RIP160, 140, and 80, which were biochemically identified by their ability to specifically interact with the hormone binding domain of the receptor in a ligand-dependent manner (McKenna et al. 1999) (Fig. 2). For example, the interaction between estrogen receptors and coregulators was promoted by estradiol, whereas antiestrogens did not promote this interaction. Further studies led to the identification of many other coactivators, including glucocorticoid receptor interacting protein 1 (GRIP 1), steroid receptor coactivator 1 (SRC 1), and transcriptional intermediary factor 2 (TIF 2). When cotransfected with nuclear receptors, including the estrogen receptor, these coactivators are capable of augmenting ligand-dependent transactivation. In addition, the phospho-CREB-binding protein (CBP) and the related p300 have been demonstrated to be estrogen receptor-associated proteins involved in ligand-dependent transactivation. It has also been shown that the coactivator RIP140 interacts with the ER in the presence of estrogen, and this interaction enhances transcriptional activity between 4- and 100-fold, depending on promoter context.
3. Rapid Signaling Effects of Sex Steroids

Although steroids induce many of their effects in the brain through activation of genomic receptors, nontranscriptional actions of estrogens, progestins, glucocorticoids, and aldosterone have been observed in a variety of tissues, including the brain (Brinton 1993, 1994, Watson et al. 1995, Pappas et al. 1995). These nontranscriptional actions are characterized by short-term effects that range from milliseconds to several minutes. Additionally, these effects still occur in the presence of actinomycin D or cycloheximide, known transcriptional blockers. The first suggestion of the non-transcriptional actions of steroids came when it was observed that progesterone induced rapid anesthetic and sedative actions when injected subcutaneously (Brinton 1994). Then in the 1960s and 1970s numerous studies suggested that estrogen modulated the electrical activity of a variety of nerve cells (Foy et al. 1999). Subsequent studies utilizing various techniques and preparations have shown that these effects occur too rapidly to be genomic in nature (Foy and Teyler 1983). Morphological studies of neuronal development conducted in vitro have shown that estrogenic steroids exert a growth-promoting, neurotrophic effect on hippocampal and cortical neurons via a mechanism that requires activation of NMDA receptors (Brinton et al. 1997). In vivo studies have revealed a proliferation of dendritic spines following 17β-estradiol treatment that can be prevented by blockade of NMDA receptor channels, though not by AMPA or muscarinic receptor antagonists (Woolley 1999). Other reports have provided evidence that chronic 17β-estradiol treatment increases the number of NMDA receptor binding sites, and NMDA receptor-mediated responses (Woolley 1999).
Figure 2 Classical mechanism of steroid action. Upon binding of steroid (S) to an inactive steroid receptor, the receptor is activated and two receptor-ligand monomers dimerize and bind to the hormone response element (HRE). Coactivators such as RIP140, CBP, and SRC-1 bind to and link the hormone receptor with the general transcription factors (GTF) and RNA polymerase of the transcription machinery to alter transcription
Recent studies suggest a direct link between the estrogen receptor and the mitogen-activated protein kinase (MAPK) signaling cascade (Singh et al. 2000). MAPKs are a family of serine-threonine kinases that become phosphorylated and activated in response to a variety of cell growth signals. In neuronal cells, estrogen resulted in neuroprotection that was associated with a rapid activation of the MAPK signaling pathway (Singh et al. 2000). These neuroprotective effects, which occurred within 5 minutes of estrogen exposure, are thought to occur through the transient activation of c-src-tyrosine kinases and tyrosine phosphorylation of p21(ras)-guanine nucleotide activating protein (Fig. 3). Additionally, the potentiation of the NMDA receptor-mediated neuronal response by estrogen is thought to be mediated by c-src-tyrosine kinases (Bi et al. 2000). It is not yet clear whether these effects require the classical estrogen receptor (ERα/β) or if there is an as yet undiscovered membrane receptor that mediates these rapid effects of estrogen (Razandi et al. 1999). Thus, while steroid receptors are ligand-induced transcriptional enhancers, they are also activators of intracellular signaling pathways that can profoundly influence neuronal function and survival.

Figure 3 Rapid signaling effects of sex steroids. Estradiol, acting through a possible membrane-bound or cytoplasmic receptor, causes the transient activation of c-src-tyrosine kinase. Activated c-src can potentiate the glutamate response through the NMDA receptor. Additionally, c-src can cause the phosphorylation of p21(ras)-guanine nucleotide activating protein, leading to the activation of MAP kinase, which has many downstream effects on cell survival and growth

Challenges remain in our understanding of steroid action and sites of action in brain. Remarkably, our knowledge of genomic sites of steroid action has
greatly expanded in the past decade while the membrane sites of steroid action remain a challenge to fully characterize. Our understanding of steroid effects in brain has led to the expansion of the role of steroid receptors beyond sexual differentiation and reproductive neuroendocrine function to regulators of nearly every aspect of brain function including cognition. The full range of mechanisms by which steroids influence such a wide array of brain functions remains to be discovered.

See also: Gender Differences in Personality and Social Behavior; Gender-related Development
Bibliography

Bi R, Broutman G, Foy M R, Thompson R F, Baudry M 2000 The tyrosine kinase and mitogen-activated protein kinase pathways mediate multiple effects of estrogen in hippocampus. Proceedings of the National Academy of Sciences, USA 97: 3602–7
Brinton R D 1993 17β-estradiol induction of filopodial growth in cultured hippocampal neurons within minutes of exposure. Molecular Cell Neuroscience 4: 36–46
Brinton R D 1994 The neurosteroid 3 alpha-hydroxy-5 alpha-pregnan-20-one induces cytoarchitectural regression in cultured fetal hippocampal neurons. Journal of Neuroscience 14: 2763–74
Brinton R D 1998 Estrogens and Alzheimer's disease. In: Marwah J, Teitelbaum H (eds.) Advances in Neurodegenerative Disorders, Vol. 2: Alzheimer's and Aging. Prominent Press, Scottsdale, AZ, pp. 99
Brinton R D, Proffitt P, Tran J, Luu R 1997 Equilin, a principal component of the estrogen replacement therapy Premarin, increases the growth of cortical neurons via an NMDA receptor-dependent mechanism. Experimental Neurology 147: 211–20
Foy M R, Teyler T J 1983 17-alpha-Estradiol and 17-beta-estradiol in hippocampus. Brain Research Bulletin 10: 735–9
Foy M R, Xu J, Xie X, Brinton R D, Thompson R F, Berger T W 1999 17β-estradiol enhances NMDA receptor-mediated EPSPs and long-term potentiation in hippocampal CA1 cells. Journal of Neurophysiology 81: 925–8
Horwitz K B, Jackson T A, Bain D L, Richer J K, Takimoto G S, Tung L 1996 Nuclear receptor coactivators and corepressors. Molecular Endocrinology 10: 1167–77
Keightley M C 1998 Steroid receptor isoforms: Exception or rule? Molecular & Cellular Endocrinology 137: 1–5
McEwen B S 1997 Hormones as regulators of brain development: Life-long effects related to health and disease. Acta Paediatrica 422 (Suppl.): 41–4
McEwen B S 1999 Clinical review 108: The molecular and neuroanatomical basis for estrogen effects in the central nervous system. Journal of Clinical Endocrinology & Metabolism 84: 1790–7
McKenna N J, Lanz R B, O'Malley B W 1999 Nuclear receptor coregulators: Cellular and molecular biology. Endocrine Reviews 20: 321–44
Pappas T C, Gametchu B, Watson C S 1995 Membrane estrogen receptors identified by multiple antibody labeling and impeded-ligand binding. FASEB 9: 404–10
Razandi M, Pedram A, Greene G L, Levin E R 1999 Cell membrane and nuclear estrogen receptors (ERs) originate from a single transcript: Studies of ERalpha and ERbeta expressed in Chinese hamster ovary cells. Molecular Endocrinology 13: 307–19
Shughrue P J, Lane M V, Merchenthaler I 1997 Comparative distribution of estrogen receptor-alpha and -beta mRNA in the rat central nervous system. Journal of Comparative Neurology 388: 507–25
Singh M, Setalo G J, Guan X, Frail D E, Toran-Allerand C D 2000 Estrogen-induced activation of the mitogen-activated protein kinase cascade in the cerebral cortex of estrogen receptor-alpha knock-out mice. Journal of Neuroscience 20: 1694–1700
Tsai M J, O'Malley B W 1994 Molecular mechanisms of action of steroid/thyroid receptor superfamily members. Annual Review of Biochemistry 63: 451–86
Warner M, Nilsson S, Gustafsson J A 1999 The estrogen receptor family. Current Opinion in Obstetrics & Gynecology 11: 249–54
Watson C S, Pappas T C, Gametchu B 1995 The other estrogen receptor in the plasma membrane: Implications for the actions of environmental estrogens. Environmental Health Perspectives 103 (Suppl.): 41–50
Whitfield G K, Jurutka P W, Haussler C A, Haussler M R 1999 Steroid hormone receptors: Evolution, ligands, and molecular basis of biologic function. Journal of Cell Biochemistry 32–3 (Suppl.): 110–22
Woolley C S 1999 Electrophysiological and cellular effects of estrogen on neuronal function. Critical Reviews in Neurobiology 13: 1–20
R. D. Brinton and J. T. Nilsen
Sex Offenders, Clinical Psychology of
The clinical psychology of sex offenders involves assessment, treatment, and prevention. Clinical assessment involves the careful description of the problem and an estimation of the risk of recidivism, or re-offense. Clinical treatment involves interventions to reduce the risk of recidivism. Clinical prevention involves interventions before a person becomes a sex offender, to keep sex offending from starting. There is much more research on assessment and treatment of sex offenders than there is on prevention of sex offending.
1. Who are Sex Offenders?
A sex offender is anyone who has forced another person to engage in sexual contact against that person's will. Sex offending may or may not involve physical force. For example, a person can use psychological force to get another person to have sex, as in the case of a power differential between two persons (e.g., employer–employee). However, coercive sex that involves physical force is more likely to be regarded as sex offending than coercive sex that does not. Another issue is the ability of the victim to give consent. Minors and developmentally disabled persons are usually considered to lack the ability to give consent. A person's ability to give consent may also become impaired, as in the case of substance abuse. Thus, a person engaging in sex with someone who is unable to consent or whose ability to give consent is impaired might be considered a sex offender. The term sex offender usually is associated with a person who has been apprehended in a legal context for coercive sexual behavior. Nevertheless, not all persons who engage in sexually coercive behaviors are caught. Thus, two persons could engage in the same sexually coercive behavior, but the one who is caught would be considered a sex offender and the one who is not caught would not. Moreover, legal statutes
vary from jurisdiction to jurisdiction. For example, some jurisdictions may require sexual penetration for a person to be considered a sex offender, while other jurisdictions have broader definitions that include nonpenetrative forms of sexual contact. Legal jargon may obscure the meaning and impact of sexually coercive behavior as well. For example, terms such as 'indecent liberties' and 'indecent assault' may both refer to rape, defined as forced sexual intercourse. However, indecent liberties or indecent assault charges carry very different legal meanings and punishments than does rape. Similarly, rape committed by a stranger is more likely to be prosecuted than rape committed by someone known to the victim. Nevertheless, the behavior in both instances is rape. Because of these inconsistencies and vagaries in legal definitions of sex offending, this article adopts a broad approach to sexual aggression, focusing on coercive sexual behavior whether or not the perpetrator has been apprehended. Perpetrators and victims often disagree on whether sexual aggression actually occurred, or on its seriousness. A comprehensive discussion of the veracity of perpetrator and victim reports is beyond the scope of this article. However, current definitions of sexual aggression give more weight to victims' perceptions of the occurrence of sexual aggression, given perpetrators' tendencies to defend and minimize aggressive behavior and the inherent power differential between victims and offenders. It is usually disadvantageous to bring upon oneself the negative attention that accompanies accusing someone of sexual victimization. In courtroom settings, victims' personal and sexual histories are examined in cases of sexual abuse in a manner that is unparalleled for other types of offenses. Nevertheless, child custody disputes appear to be one context in which there may be some risk, albeit small, of false accusations of sexual abuse on the part of a parent seeking to disqualify the parenting fitness of the other parent. The Diagnostic and Statistical Manual (DSM) of the American Psychiatric Association includes several sexual disorders, or paraphilias, that sex offenders may have. These include fetishism (sexual arousal associated with nonliving objects), transvestic fetishism (sexual arousal associated with the act of cross-dressing), voyeurism (observing unsuspecting nude individuals in the process of disrobing or engaging in sexual activity), exhibitionism (public genital exposure), frotteurism (touching a nonconsenting person's genitalia or breasts, or rubbing one's genitals against a nonconsenting person's thighs or buttocks), pedophilia (sexual attraction or contact involving children), sexual masochism (sexual arousal associated with suffering), and sexual sadism (sexual arousal associated with inflicting suffering). Although the fetish disorders do not involve contact with an actual person, these disorders can result in arrest when a fetishist
steals the fetish items (e.g., undergarments). Rape often does not involve sexual arousal directly associated with suffering and is more commonly classified in DSM as a component of antisocial personality disorder than as a sexual disorder. Such a classification appears justified, in that many rapists engage in both sexual and nonsexual forms of aggression and other rule-violating behaviors. The focus of this article is on rape and child molestation, because more research exists on these topics than on any of the other types of sexual offenses. The emphasis in the literature on these two sex offenses reflects the fact that they may cause more harm to victims than the other disorders.
2. History
The clinical study of sex offenders had its beginnings in research on sexuality. Some of the earliest scholarly writings on sexual deviance were detailed case studies by the psychiatrist Krafft-Ebing (1965/1886). Krafft-Ebing postulated that all sexual deviations were the result of masturbation. The case study method focused exclusively on highly disturbed individuals without matched control cases, which precluded objective considerations of etiology (Rosen and Beck 1988). Kinsey and colleagues (1948) conducted large-scale normative surveys of sexual behavior, resulting in major works on male and female sexuality. Because adult–child sexual contact was relatively common in his samples, Kinsey underplayed the negative effects of such behavior (Rosen and Beck 1988). Work on sexual deviance continued at the Kinsey Institute after Kinsey's death (Gebhard et al. 1965). A major advance in the assessment of sexual arousal was the development of laboratory measures of penile response by Freund (1963). Freund's measure, known as the penile plethysmograph, involved an inflatable tube constructed from a condom, by which penile volume change in response to erotic stimuli (e.g., nude photographs) was measured by air displacement (Rosen and Beck 1988). Less intrusive penile measures to assess circumference changes were developed later (Bancroft et al. 1966; Barlow et al. 1970). Among behaviorists, penile response to deviant stimuli (e.g., children, rape) became virtually a gold standard of measurement for sexual deviance (e.g., Abel et al. 1977). The emphasis of behaviorists on the role of sexual arousal in sex offending came under criticism from feminist theorists. Rape was conceptualized as a 'pseudosexual' act of anger and violence rather than as a sexual disorder. This approach was popularized by Groth (1979). More recent conceptualizations have incorporated sexual, affective, cognitive, and developmental motivational components of sex offending (Hall et al. 1991).
3. Risk Factors for Sex Offending
One major risk factor for being a sex offender is being male. Less than 1 percent of females perpetrate any form of sexual aggression, whereas the percentage of men who are rapists ranges from 7 to 10 percent, and the percentage of men who admit to sexual contact with children is 3 to 4 percent. Evolutionary psychologists suggest that mating with multiple partners provides males with a reproductive advantage. Thus, sexually aggressive behavior may be a by-product of evolutionary history. Nevertheless, most men are not sexually aggressive. Thus, being male may be a risk factor for sexual aggression, but certain environmental conditions may be necessary for someone to become sexually aggressive. For example, aggressive behavior, including sexually aggressive behavior, is accepted and socialized among males more than females in most societies. Another risk factor for males becoming sexually aggressive, particularly against children, is personal sexual victimization. Boys who have been sexually victimized are more likely to engage in sexualized behaviors (e.g., sexual touching) immediately following sexual victimization than boys who have not. Moreover, in recent studies, over half of adult sex offenders have reported being sexually abused during childhood. However, the sexual abuse of males is not invariably associated with becoming sexually abusive, as the majority of males who are sexually abused do not become abusers. The single best predictor of sex offending is past sex offending. Persons who have sexually offended multiple times in the past are more likely to sexually offend in the future than those who have limited histories of sex offending. Recidivism rates for child molesters and rapists are similar, at 25 to 32 percent. Persons who have committed multiple sex offenses have broken the barriers against offending and have lowered the threshold for re-offending by developing a pattern of behavior. Thus, sex offenders with multiple offenses are the group that poses the greatest risk to community safety. Moreover, the predictive utility of past sex offending suggests that interventions with first-time offenders and interventions to prevent sex offending from starting are critical to avoid the establishment of an ingrained pattern of behavior.
4. Motivational Factors for Sexual Offending
What causes men to become sex offenders? One common explanation has been the sexual preference hypothesis: that sex offenders are more sexually aroused by coercive than by consenting sexual activity. Learning theories posit that sexual arousal to coercive sexual activity is conditioned (see Classical Conditioning and Clinical Psychology). For example, a pedophile may have had sexually arousing experiences
during childhood with peers and may never have outgrown these experiences. The fusion of violent and sexual images in the media may influence some men to associate sexual arousal with aggressive activity. The sexual preference hypothesis appears to be most appropriate for some sex offenders against children, particularly those who offend against boys. For many men who molest children, their sexual arousal to children exceeds their sexual arousal to adults, as assessed by genital measures. However, some child molesters, particularly incestuous offenders, may be more sexually aroused by adults than by children. The sexual preference hypothesis is less applicable to men who rape adults. Most rapists' sexual arousal is greatest in response to consenting adult sexual activity and lower in response to sexual activity that involves physical force. However, a minority of rapists do exhibit a sexual preference for rape over consenting adult sexual activity. These rapists would be considered sexual sadists. Another common explanation of sex offending has been anger and power. Some feminist scholars have gone so far as to say that sex offending is a 'pseudosexual' act and that it is not about sex. However, recent feminist scholarship recognizes both the sexual and the aggressive aspects of sex offending. If sex offending were a violent, rather than a sexual, act, why would the perpetrator not simply assault the victim rather than sexually assault the victim? Anger and power are most relevant in explaining the rape of women by males. Adversarial relationships with women may cause some men to attempt to assert their sense of superiority over women via rape. Some child molesters may also have anger and power motives for their offending. However, depression is a more common precursor to child molesting than anger. One source of depression may be perceived or actual social incompetence in peer relationships. Sexual contact with children may represent a maladaptive method of coping with this depression. Excuses to justify sexually aggressive behavior may be the motivation for some forms of sex offending. Such excuses deny or minimize the impact of sexual aggression and are known as cognitive distortions. Cognitive distortions are common in acquaintance rape situations and in incest. An acquaintance rapist may contend that rape cannot occur in the context of a relationship or that the existence of a relationship justifies any type of sexual contact that the rapist desires. Common cognitive distortions among incest offenders are that sexual contact with their child is a form of affection or sexual education, or that it merely amounts to horseplay. Thus, a person employing cognitive distortions construes the sexual contact as 'normal' rather than aggressive. It is likely that these different types of motivation for sex offending interact for many sex offenders. Sexual arousal to coercive sexual behavior, emotional problems, and cognitive distortions may coexist for
some sex offenders, and a singular explanation of the basis of the problem may be inadequate. Nevertheless, complex explanations of sex offending are less likely to result in treatment interventions than are explanations that identify the major motivational issues.
5. Treatment Interventions for Sex Offenders
The most common forms of treatment for sex offenders have been behavioral, cognitive-behavioral, and psychohormonal interventions. Behavioral methods have typically involved interventions to reduce sexual arousal to inappropriate stimuli (e.g., children, rape) (see Behavior Therapy: Psychiatric Aspects). Aversive conditioning, which pairs sexual arousal to the inappropriate stimulus with an aversive stimulus (e.g., a foul odor, thoughts about punishment for sex offending), is perhaps the most widely used behavioral treatment with sex offenders. Cognitive-behavioral methods focus on the influence of cognitive factors, including individual beliefs, standards, and values, on sex-offending behavior (see Cognitive Therapy). Common cognitive-behavioral treatments often involve cognitive restructuring, empathy enhancement, social skills training, and self-control techniques. Relapse prevention, which has been adapted from the treatment of addictive behaviors, has gained relatively wide acceptance in the treatment of sex offenders. This cognitive-behavioral method involves self-control via anticipating and coping with situations following treatment that create a high risk of relapse (e.g., social contact with potential victims) (see Relapse Prevention Training). Psychohormonal treatments involve the use of antiandrogen drugs to reduce the production of testosterone and other androgens. Such androgen reduction suppresses sexual arousal. The antiandrogen that has been most commonly used is medroxyprogesterone (Depo-Provera). Evidence across recent studies of rapists, child molesters, and exhibitionists suggests that cognitive-behavioral and psychohormonal interventions may be more effective than no treatment or behavioral treatments. Recidivism rates among sex offenders, as measured by arrests for sex offenses, are 25–32 percent. Sex offender recidivism rates in studies of behavioral treatments have been approximately 40 percent. Both cognitive-behavioral and psychohormonal treatments have yielded recidivism rates of 13 percent. Why is there such a high rate of recidivism in studies of behavioral treatments? It could be contended that behavioral treatments are too narrowly focused on sexual arousal and ignore other motivational issues (e.g., emotional, cognitive). However, psychohormonal treatments also primarily focus on sexual arousal reduction and result in relatively low recidivism rates. Thus, it is possible that any positive effects of behavioral treatments 'wear off' over time. It is also possible that the sex offenders in the
behavioral treatment studies were at higher risk for recidivism, although the recidivism rates for all forms of treatment are based on both inpatient and outpatient samples. Thus, there is support for cognitive-behavioral and psychohormonal treatments being the most effective treatments with sex offenders. Nevertheless, the evidence of this treatment effectiveness is somewhat limited. A major limitation of psychohormonal treatments is compliance. These treatments usually involve intramuscular injections and are effective only as long as they are complied with. Sex offender refusal and dropout rates with psychohormonal treatments range from 50 to 66 percent. Moreover, a study directly comparing the relative effectiveness of cognitive-behavioral vs. antiandrogen treatments within the same population of sex offenders is needed. Such a study could clarify whether the effective suppression of sexual arousal achieved by antiandrogen treatments is sufficient to reduce recidivism or whether the more comprehensive aspects of cognitive-behavioral treatments offer necessary adjuncts. Also unknown is whether a combined cognitive-behavioral plus antiandrogen approach would be superior to either approach alone. An encouraging development is evidence of the effectiveness of cognitive-behavioral methods in reducing recidivism among adolescent sex offenders. Interventions with adolescents may prevent the development of a history of sex offending that is associated with re-offending. However, there are extremely few outcome studies of adolescent sex offender treatment. Most of the available treatment research on sex offenders has been conducted with European American populations. It is unknown whether these treatments are equally effective with other groups. For example, cognitive-behavioral methods are individually based. However, individually based interventions may not be as effective in cultures in which there is a strong group orientation; individual change might be offset by group norms. For example, in some patriarchal groups, misogynous attitudes may be normative and sexual aggression permissible. Conversely, there may be protective cultural factors that could be mobilized to prevent sex offending. Thus, the context in which interventions are used needs to be examined in future research.
6. Prevention
There have been virtually no prevention studies that have examined perpetration of sex offending as an outcome measure. Yet the costs to society of sex offending, in terms of damage to and rehabilitation of victims and of incarceration and rehabilitation of offenders, compel us to seek ways to prevent the problem before it occurs. The recidivism rates of the most effective treatments are at 13 percent, which is
significantly better than other forms of treatment or no treatment. Yet it could be contended that a 13 percent recidivism rate is unacceptably high. Perhaps the effective components of cognitive-behavioral interventions could be adapted for proactive use in prevention programs. Most sexual abuse prevention programs have focused their interventions on potential victims. However, perpetrators, not victims, are responsible for sex offenses. A completely effective sex offense prevention program would eliminate the need for victim programs. Prevention programs for potential victims are critically important in terms of empowerment. However, there has been a disproportionate amount of attention in sexual abuse prevention to victims. Relatively simple modifications of existing victim prevention programs could potentially go a long way toward preventing perpetration of sex offenses. For example, most children's sexual abuse prevention programs present the concept of 'bad touch,' which usually is instigated by someone else. Missing from most of these programs, however, is the idea that not only should someone else not 'bad touch' you, but you also should not 'bad touch' someone else. Such an intervention could reach many potential perpetrators who would not otherwise receive this information. The impact of efforts to prevent sex offense perpetration is unknown. Thus, there is a great need for the development and evaluation of interventions to prevent sex offending.
See also: Childhood Sexual Abuse and Risk for Adult Psychopathology; Rape and Sexual Coercion; Regulation: Sexual Behavior; Sexual Harassment: Legal Perspectives; Sexual Harassment: Social and Psychological Issues; Sexual Perversions (Paraphilias); Treatment of the Repetitive Criminal Sex Offender: United States
Bibliography
Abel G G, Barlow D H, Blanchard E B, Guild D 1977 The components of rapists' sexual arousal. Archives of General Psychiatry 34: 895–903
Bancroft J H, Gwynne Jones H E, Pullan B P 1966 A simple transducer for measuring penile erection with comments on its use in the treatment of sexual disorder. Behaviour Research and Therapy 4: 239–41
Barbaree H E, Marshall W L, Hudson S M 1993 The Juvenile Sex Offender. Guilford Press, New York
Barlow D, Becker R, Leitenberg H, Agras W W 1970 A mechanical strain gauge for recording penile circumference change. Journal of Applied Behavior Analysis 3: 73–6
Buss D M, Malamuth N M 1996 Sex, Power, Conflict: Evolutionary and Feminist Perspectives. Oxford University Press, New York
Freund K 1963 A laboratory method for diagnosing predominance of homo- or heteroerotic interest in the male. Behaviour Research and Therapy 1: 85–93
Gebhard P H, Gagnon J H, Pomeroy W B, Christenson C V 1965 Sex Offenders: An Analysis of Types. Harper & Row, New York
Groth A N 1979 Men Who Rape: The Psychology of the Offender. Plenum, New York
Hall G C N 1995 Sexual offender recidivism revisited: A meta-analysis of recent treatment studies. Journal of Consulting and Clinical Psychology 63: 802–9
Hall G C N 1996 Theory-based Assessment, Treatment, and Prevention of Sexual Aggression. Oxford University Press, New York
Hall G C N, Andersen B L, Aarestad S, Barongan C 2000 Sexual dysfunction and deviation. In: Hersen H, Bellack A S (eds.) Psychopathology in Adulthood, 2nd edn. Allyn and Bacon, Boston
Hall G C N, Barongan C 1997 Prevention of sexual aggression: Sociocultural risk and protective factors. American Psychologist 52: 5–14
Hall G C N, Hirschman R, Beutler L E (eds.) 1991 Special section: Theories of sexual aggression. Journal of Consulting and Clinical Psychology 59: 619–81
Heilbrun K, Nezu C M, Keeney M, Chung S, Wasserman A L 1998 Sexual offending: Linking assessment, intervention, and decision making. Psychology, Public Policy, and Law 4: 138–74
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. W B Saunders, Philadelphia
Krafft-Ebing R von 1965 Psychopathia Sexualis. Putnam, New York (Original work published in 1886)
Marshall W L, Fernandez Y M, Hudson S M, Ward T 1998 Sourcebook of Treatment Programs for Sexual Offenders. Plenum Press, New York
Prentky R A, Knight R A, Lee A F S 1997 Risk factors associated with recidivism among extrafamilial child molesters. Journal of Consulting and Clinical Psychology 65: 141–9
Rosen R C, Beck J C 1988 Patterns of Sexual Arousal: Psychophysiological Processes and Clinical Applications. Guilford, New York
G. C. N. Hall

Sex Preferences in Western Societies
Since 1950, a great number of authors working in the fields of anthropology, demography, sociology, and psychology in North America and Europe have tried to determine parental sex preferences. These authors have used various approaches and samples, but in general the results are relatively consistent. In this entry, the four methods most frequently used in this field will be reviewed: (a) first-child preference, (b) only-child preference, (c) sex preference for the next child, and (d) the parity-progression ratio technique.
1. First-child Preference
Many authors have tried to determine the sex preference of adult individuals with regard to their firstborn child, with a question such as 'For your first child, would you prefer a girl or a boy?' More than 30 studies
conducted in North America, for the most part with samples of college students or respondents recruited for fertility studies, have indicated that most women and men prefer a boy rather than a girl for their firstborn child. These findings give the strong impression that the preference for a boy as a firstborn has been universal among people in Western societies since 1950. Since these results are based on a hypothetical situation that might never arise in the respondent's life, Steinbacher and Gilroy (1985) have argued that an assessment of sex preference while women are pregnant would be more valid. A review of the literature in English and French found 16 empirical investigations in which it was possible to identify the maternal sex preference of pregnant women. In eight of these studies, information concerning the expectant father's preference was also available, either from the fathers themselves (in three studies) or from their pregnant wives (in five studies). These studies clearly indicate that first-time pregnant women more often prefer a girl than a boy, especially after 1981, when a preference for a girl is shown in six out of seven studies. The data concerning expectant fathers are different; in fact, in seven out of the eight studies, men preferred a boy rather than a girl for a first child. Concerning the variables associated with sex preference for a first child, two studies have presented data on pregnant women (Uddenberg et al. 1971, Steinbacher and Gilroy 1985). First, in a small sample of 81 first-time pregnant women in Sweden, Uddenberg et al. (1971) found that women who grew up with only female siblings more often preferred a son first. (Some other studies showed the opposite: namely, that the more sisters a woman has, the greater her preference for a girl.) In addition, women who desire a girl are psychologically more autonomous than women who desire a boy. Interestingly, Uddenberg et al. (1971) also found no significant difference in age or social class between the women who preferred a girl or a boy, or who expressed no specific preference. Second, the study by Steinbacher and Gilroy (1985), which dealt with 140 first-time pregnant women in the USA, reported that older women more often chose the no-preference category, and that those who agreed strongly with the women's movement preferred a girl rather than a boy. Otherwise, they did not find any significant relationship with variables such as race, income, marital status, or religion. Interestingly, in this type of literature, the no-preference percentage has varied from 25 percent to 59 percent since 1981. An important point is to determine whether many of the women who claim to have no preference in fact use this answer to hide a preference, as has been suggested by Pharis and Manosevitz (1980). Marleau et al. (1996) consequently checked the validity of this traditional sex preference question by comparing
the answers to that question with the answers to a 'feminine/masculine' scale which assessed how pregnant women imagined the sex of their future baby. Women who had expressed no sex preference on the direct question indeed had no explicit image of their baby as male or female on the 'feminine/masculine' scale. This experiment seems to confirm the validity of the classical direct question.
2. Only-child Preference
In reviewing the literature relative to only-child preference, Marleau and Maheu (1998) identified many studies in which it was possible to identify women's and/or men's sex preference(s). Two subgroups of studies are found. The first consists of 11 studies in which the subjects were forced to select the sex of a child on the basis of the hypothetical situation that they would have only one child in their whole lives. In the second subgroup, five studies were designed to elicit the number of children the subjects desired in their lives. Some subjects declared that they wanted only one child, and it was possible from these answers to determine the sex of this child for further analysis. The results of these two subgroups of studies were collapsed together for the final analysis. It should be noted that all of these studies were conducted in the USA between 1951 and 1991, and that in the majority of studies the samples consisted of college or university students. The main results indicate that women, in general, prefer a boy to a girl for an only child. However, in three of the five most recently published studies, women more often preferred a girl than a boy for an only child. Results for men show that in nearly all studies, at least 70 percent prefer a boy. Some of these authors have tried to determine whether any variables are related to the sex preference. The most frequently found connection has been to education; in the two most recent decades, the data indicate that women who have reached university level more often prefer a girl than a boy for an only child. Men, whether they have a university education or not, more often prefer a boy. Pooler (1991) has identified two other variables: 'wife retaining her own name' and 'religion.' Female students who agree with the idea of retaining their own name after marriage more often prefer a girl. In addition, Jewish students prefer a girl, whereas Catholics and Protestants prefer a boy. Some authors have hypothesized that women's preference for a female only child could be attributed to the fact that the perceived traditional female role disadvantage now appears to be diminishing significantly in Western societies. For example, Hammer and McFerran (1988) showed, significantly, that all subgroups of females (except unmarried noncollege females) would prefer to be reborn as a female.
3. The Preference for Sex of the Next Child
Another method consists in asking individuals their sex preference for the next child, given the existing family composition. This type of question is habitually found in fertility surveys. In general, the data indicate that women with only one child more often prefer a child of the opposite sex. Moreover, a high percentage of women prefer a child of the opposite sex when they already have two or three children of the same sex. (Men rarely participate in these fertility surveys.) For example, in data from the Canadian Fertility Survey of 1984, Marleau and Saucier (1993) showed that almost half of the women who already had a child hoped that their second child would be of the opposite sex. Nearly 80 percent of women who already had two boys desired a girl for their next child, whereas nearly 50 percent of those with two girls preferred a boy. Some authors have worked with other measures, especially the mean number of desired children. Here the results are mixed. Some studies have found that women with a boy and a girl intend to have the same mean number of children as those who have two children of the same sex, but other studies have found that the mean number of children desired is higher for the latter group. Other authors have worked with a measure such as the use of contraception by women who have already had children. A study by Krishnan (1993), using the Canadian Fertility Survey of 1984 on women aged between 18 and 49, showed a son preference: women who already had two sons were more likely to use contraception than those who had two girls.
4. Parity Progression Ratio
A method used by many authors is the parity-progression ratio. This technique consists in observing real behavior rather than relying on verbal statements of attitude, as in the methods mentioned above. The parity progression ratio is the proportion of couples at a given parity who have at least one additional child. If certain sex compositions of existing children are associated with a lower than average progression ratio, the inference is made that the predominant sex in those compositions is preferred. More than 30 studies were identified in the literature. In general, the data indicate that couples with one child continue to bear children regardless of the sex of the first child. Parents with two children of the same sex are more likely to go on than those with one child of each sex. For example, it was shown (Marleau and Saucier 1996), in a large sample from the Canadian General Social Survey of 1990, that 58 percent of couples with two children of the same sex went on to have another child, as compared with 53 percent of couples with one child of each sex. Those who stopped childbearing most often, on the other hand, were those with a boy first and a boy second. No such differences in behavior occurred in couples with three children. This method is interesting because it can be computed from large databases collected for other purposes. A weakness of the method, however, is that it gives good results only if sex preferences are relatively homogeneous in the population studied.
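To make the computation concrete, the following sketch shows how parity progression ratios might be tabulated from count data. All counts are invented for illustration (chosen to echo the 58 percent versus 53 percent result cited above); this is not the dataset or code of any study cited here.

```python
# Hypothetical counts of two-child couples, classified by the sex
# composition of their existing children, together with the number
# who went on to a third birth. The figures are invented, loosely
# echoing the 58 percent (same sex) vs. 53 percent (mixed) result above.
couples = {
    "two boys":     {"at_parity_2": 1000, "had_third_child": 590},
    "two girls":    {"at_parity_2": 1000, "had_third_child": 570},
    "boy and girl": {"at_parity_2": 2000, "had_third_child": 1060},
}

for composition, counts in couples.items():
    # Parity progression ratio: the proportion of couples at parity 2
    # who have at least one additional child.
    ppr = counts["had_third_child"] / counts["at_parity_2"]
    print(f"{composition}: PPR = {ppr:.2f}")
```

A lower-than-average progression ratio for a given composition (here, the mixed-sex couples) is read as satisfaction with that composition, while the higher ratios for the same-sex pairs suggest that those couples are still trying for a child of the missing sex.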
5. Conclusion
When we compare the findings reviewed above, we note a global tendency revealed by the first three methods, namely an increasing preference among women for a girl over a boy since 1980. More research will be needed to verify whether this trend will continue and to understand the reasons for this recent shift. A further trend is revealed by the first method, namely the increasing proportion of women who have chosen the no-preference option since 1980. This trend is visible for both pregnant and nonpregnant women.
See also: Family Size Preferences; Family Systems and the Preferred Sex of Children; Family Theory and the Realities of Childbearing Behavior; Family Theory: Economics of Childbearing; Fertility Control: Overview; Fertility: Proximate Determinants
Bibliography
Hammer M, McFerran J 1988 Preference for sex of child: A research update. Individual Psychology 44: 481–91
Krishnan V 1993 Gender of children and contraceptive use. Journal of Biosocial Science 25: 213–21
Marleau J D, Maheu M 1998 Un garçon ou une fille? Le choix des hommes et des femmes à l'égard d'un seul enfant. Population 5: 1033–42
Marleau J D, Saucier J-F 1993 Préférence des femmes canadiennes et québécoises non enceintes quant au sexe du premier enfant. Cahiers québécois de démographie 22: 363–72
Marleau J D, Saucier J-F 1996 Influence du sexe des premiers enfants sur le comportement reproducteur: une étude canadienne. Population 2: 460–3
Marleau J D, Saucier J-F, Bernazzani O, Borgeat F, David H 1996 Mental representations of pregnant nulliparous women having no sex preference. Psychological Reports 79: 464–6
Pharis M E, Manosevitz M 1980 Parental models of infancy: A note on gender preferences for firstborns. Psychological Reports 47: 763–8
Pooler W S 1991 Sex of child preferences among college students. Sex Roles 9/10: 569–76
Steinbacher R, Gilroy F G 1985 Preference for sex of child among primiparous women. The Journal of Psychology 124: 283–8
Uddenberg N, Almgren P E, Nilsson A 1971 Preference for sex of the child among pregnant women. Journal of Biosocial Science 3: 267–80
J. D. Marleau and J.-F. Saucier
Sex-role Development and Education
Today's debate on sex or gender is dominated by two antagonistic paradigms: sociobiology and constructivism. In evolutionary terms, man's characteristics are determined by natural selection operating at the level of genes. Man has survived millions of years of competitive struggle for existence. This by definition implies egotism as a core feature of his genes, i.e., a readiness to maximize one's own reproductive chances at the cost of others. Egotistic genes may produce 'altruistic' behavior in cases where the net reproductive success of one's own genes (embodied in close relatives) can be increased (kinship altruism) or future repayments in emergencies can be secured (reciprocal altruism). Gene egotism has implications for sex differences. Across all species, females' investment in reproduction is greater: they produce fewer and larger, i.e., more costly, gametes; among mammals they invest time in carrying and nursing the young, and among humans in taking care of them during an extended phase of dependency. Both sexes—being but vehicles of egotistic genes—strive to maximize reproductive success. Different strategies, however, will be efficient for each. In view of their high pre-investment, taking care of the young pays for females; and given that females will take care, it pays better for males to spread their genes as widely as possible. For females, sexual reserve is the better strategy—it allows them to test a male's fidelity and increase his investment; and it pays to select high-status males, since their greater resources increase the chances of survival for the few young a female will be able to bear. In contrast, males profit from quick seductions and from selecting females according to beauty and youthfulness—criteria that indicate high reproductive capacity. Further assumptions are needed: given that genes happen to reside half of the time of their existence in male bodies and the other half in female bodies, they have to be assumed to operate differently in different environments. Also, the adaptive value of present characteristics can be justified only by reference to the life conditions of our ancestors—of which we know little. Constructivism draws a contrary picture. Sex differences—so runs its core claim—are but social constructions. In fact, the classificatory system itself is a modern Western design stipulating the following
features (see Tyrell 1986). (a) Reference to physical aspects, i.e., the presence/absence of a penis (some cultures focus more on social activities like childcare or warfare). (b) Binary, i.e., a strictly exclusive categorization (some cultures allow for or even ascribe a positive status to hermaphrodites). (c) Inclusive, i.e., all individuals are classified, even those with unclear genetic make-up (using an operation to improve outward appearance if necessary). (d) Irreversible, except via operation (some cultures allow for social sex role changes). (e) Ascriptive from birth on (some cultures define children as neutrals and ascribe sex role membership only in initiation rituals). The very assumption of large sex differences is a modern idea that from the constructivist perspective is mistaken: it is an essentialist reification of what in fact is but a cooperative interactive achievement. Humans—so the basic tenet—don't 'have' a sex and they 'are' not males or females; rather, they 'act' and 'see' each other as such. Studies on transsexuality analyze the ways in which individuals perform and recognize gender. The two paradigms are based on opposite assumptions. In sociobiology man is but a vehicle for powerful genes; in constructivism he is an omnipotent creator. In sociobiology he is a lone wolf entering social relations only for reproductive concerns; in constructivism he is a social being whose very (even sexual) identity is dependent on interactive co-construction. Nevertheless, both paradigms concur in their ahistorical approach. In sociobiology, genes that survived under the living conditions of our ancestors determine human dispositions forever—across all periods and societies. In constructivism, man is created ever anew in each interaction situation. Both approaches simply ignore the power of history—consolidated in collective traditions and reproduced in biographical learning processes. These will come to the fore in the following analysis of present sex role understanding. Three aspects will be discussed: its empirical description, historical emergence, and ontogenetic development.
1. Sex Differences—a Descriptive Perspective
Sex differences can be analyzed on various levels: that of the individual, of the culture, and of the social structure.
1.1 Psychological Level
Recent meta-analyses of US data show that, with greater educational equality, sex differences in cognitive performance have largely disappeared over the past few decades, except for a slight overrepresentation of males at the very top of mathematical abilities and the very bottom of verbal abilities, and a higher average male performance in spatial ability tasks. Male spatial superiority is even greater among
Mexicans but has not been found among Eskimos. These latter findings suggest a connection between the development of spatial understanding and cultural differences in the degree of supervision and control exerted upon young girls. It has been claimed that morality is gendered: women are held to be more flexible and care-oriented, men to be more rigidly oriented to abstract principles and more autonomous. These differences are seen to arise from differences in the structure of the self shaped by the early experience of female mothering. Girls can maintain the primary identification with the first caretaker (relational self), while boys—in order to become different—have to distance themselves (autonomous self) (Gilligan and Wiggins 1988). Neither flexibility nor care, however, is specific to women. Flexibility is a correlate of a modern moral understanding. Kant had still ascribed exceptionless validity to negative duties, given that God—not man—was held responsible for any harm resulting from compliance. In inner-worldly terms, however, impartially minimizing harm is given priority over strict obedience to rules. If exceptions are deemed justifiable at all, they will more likely be conceded by those who are aware of the possible costs incurred by anyone affected. Individuals who are personally involved will be more knowledgeable of such costs. This may explain why women—in agreement with Gilligan—were found to judge more flexibly with respect to abortion, yet at the same time more rigidly with respect to the issue of draft resistance, than men (Döbert and Nunner-Winkler 1985). Besides, those in power can insist more rigidly on their convictions; thus flexibility might be the virtue of subordinates. Care is not part of a relational self-structure produced in early childhood—rather, it is part of the female role obligation. Indeed, preschool girls showed no more empathic concern than boys (Nunner-Winkler 1994), but a majority of (especially older) German subjects justified strictly condemning working mothers more often by referring to their dereliction of duty and their egotistic strivings for self-fulfillment than to the harm their children might suffer.
1.2 Cultural Level
Two aspects—although empirically concurring—need to be distinguished: gender stereotypes and gender-role obligations. Gender stereotypes are collectively shared assumptions about the different 'nature' of men and women. Across cultures, men are assumed to be aggressive, independent, and assertive, women to be emotional, sensitive, empathic, and compliant. Contradictory evidence does not detract from such persuasions—immunity to empirical refutation is the very core of stereotypes, and there are mechanisms to uphold them.
Expectations guide the way observations are perceived, encoded, and interpreted; e.g., noncompliance is perceived as a sign of strength if shown by a man, of dogmatism if shown by a woman. Behavior that conforms to expectations is encoded in abstract terms, discrepant behavior with concrete situational details. This eases making use of the 'except clause' when interpreting deviant cases (e.g., for a woman she is extraordinarily assertive). Thus, suitably framed and interpreted, even conflicting observations can stabilize stereotypes. With gender-role obligations, women are assigned to the private sphere, men to the public realm. Thus, taking care of children and household chores is seen as primarily women's task, breadwinning as men's. These roles are defined by contrasting features. Family roles are ascribed, diffuse, particularistic, affective, and collectivity-oriented; occupational roles are achieved, specific, universalistic, affectively neutral, and self-oriented (Parsons 1964). Identifying with and living in agreement with one's gender role will influence ways of reacting, feeling, and judging. Thus, gender differences might be understood as a correlate not primarily of genetic dispositions or of a self-structure shaped in infancy, but rather of the culturally institutionalized division of labor between the sexes.
1.3 Sociostructural Level
Gender stereotypes and role obligations influence career choice and commitment to the occupational sphere. In consequence, there is a high gender segregation of the workforce. The proportion of women is over 90 percent in some fields (e.g., secretary, receptionist, kindergarten teacher) and less than 5 percent in others (e.g., mechanic, airplane pilot). Jobs that are considered women's work tend to offer fewer opportunities for advancement, less prestige, and lower pay than jobs occupied primarily by men. Worldwide, the gender gap in average wages is 30–40 percent, and it shows little sign of closing. Top positions in the economy, politics, and the sciences are almost exclusively filled by men, and part-time working is almost exclusively a female phenomenon. Both men and women tend to hold negative attitudes towards females in authority. Women entering male occupations are critically scrutinized; males entering female occupations (e.g., nursing), in contrast, easily win acceptance and promotion.
2. The Historical Emergence of Gender Differences
There are two (partly independent) dimensions implied in the debate on gender roles: the hierarchical one of equality vs. inequality of rights and the horizontal one of difference vs. sameness in personality
make-up. We begin with equality. According to medieval understanding, individuals find themselves in different social positions by the will of God, and it is God who commanded that women obey men. The Enlightenment declared all men to be equal—irrespective of gender (or race). Thus, a new justification was needed if the subordination of women (or black slaves) was to be maintained. This instigated a search for 'natural' differences between women and men (between blacks and whites) that soon succeeded in specifying differences in brain size or the shape and position of sexual organs (in IQ)—much to the detriment of women (or blacks). Legal discrimination has largely been discontinued. Women (and blacks) are granted full rights to vote and to participate in the educational system. The assumption of gender differences, however, is still prevalent. It arose in consequence of the industrialization process. In agricultural economies, women had their own sphere of control (house, garden, cattle, commercialization of surplus products) and their contribution was essential for subsistence: 'These women in no way resemble the 19th century image of women as chaste, coy, demure … Peers describe them as wild, daring, rebellious, unruly' (Bock and Duden 1977). Industrialization led to a separation of productive and reproductive work, i.e., to the contrast between familial and occupational roles described above. With rapid urbanization and increasing anonymity around the turn of the twentieth century, antimodernist discontent grew. Increasingly, female 'complementary virtues,' e.g., empathy, emotionality, sensitivity, came to be seen as a bulwark against the cold rationality of the structure of capitalist economy and bureaucratic administration. This sentiment was (and partly still is) shared even by feminists deeply committed to securing legal, political, and social equality for women.
3. Ontogenetic Development
Each generation of newborns is an invasion of barbarians—nonetheless, within a decade or two most turn into useful members of their specific society. How is this effected? Education is too narrow a term, in that it refers primarily to methods purposefully applied in order to produce desired results. Children, however, are influenced not primarily by planned educational actions, but rather by the entirety of their life conditions. They are not merely passive objects of social instruction; rather—in mostly implicit learning processes—they actively reconstruct the basic rule systems underlying their experiences. In this way they acquire knowledge systems, value orientations, and action dispositions. In this (self-)socialization process, different learning mechanisms are at work: classical and instrumental conditioning produce response tendencies;
through bestowal and withdrawal of love a conformity disposition is shaped; through parental authority and firmness the internalization of values is furthered; children imitate the behavior of models they deem interesting, and they implicitly recognize regularities and rule structures. With increasing cognitive and ego development, reflexive self-distancing from, and consciously taking a stance towards, one's previous learning history becomes possible. How are sex roles acquired? Increasingly, parents advocate identical educational goals for boys and girls. Nevertheless, unwittingly, fathers especially tend to treat them differently—handling male infants more roughly and disapproving of sissy behavior (Golombok and Fivush 1994). Also, from early on, children prefer same-sex playmates (Maccoby 1990). Such early experiences may leave some traces. More direct sex role learning, however, seems to depend on sociocognitive prerequisites. A change has been documented to occur in children's understanding of concepts: they shift from focusing on externally observable surface features to basic definitional criteria (in the case of nominal terms) or to the assumption of stable and essential inner characteristics that all members of a given category share (in the case of natural kind terms). Sex is treated like a natural kind term, i.e., sex is understood to remain constant despite outward changes, to denote some 'essential' even if unobservable commonness, and to allow for generalizing new information across all members of the same category (Gelman et al. 1986). This constitutes a universal formal frame of reference (which makes stereotypical thinking so irresistible). It needs to be filled with content. Children learn what is typical and appropriate for men and women in their culture by beginning to selectively observe and imitate exemplary same-sex models (Slaby and Frey 1975). Largely, this learning process is intrinsically motivated (Kohlberg 1966), i.e., driven by the desire to be a 'real boy/girl' and become a 'real man/woman.' It proceeds by (mostly implicitly) reconstructing those gendered behavioral, expressive, and feeling rules that are institutionalized in the given culture. In our culture there are many cues from which children will read what constitutes sex-appropriate demeanor. The sexual division of labor is seen already in the family: women are more likely to sacrifice their career for the family (and men their family life for their career), and even full-time-employed mothers spend considerably more time on childcare and housework than fathers. In school, teachers tend to give more attention to boys and to praise them for the quality of their work (while praising girls for neatness). Curricula may segregate the sexes, offering home economics to girls and contact sports or mechanics training to boys. The social structure of the school impresses the idea of male authority over women, with most principals in elementary schools being male although most teachers are female. In fact, it has been found that first-graders attending schools
with a female principal display less stereotypical views on gender roles than children in schools with a male principal. These early lessons on sex differences are reinforced in public: in politics and business, top positions are mostly filled by men, and books, films, TV, and advertisements depict men as dominant, powerful, and strong, women as beautiful, charming, and yielding. Thus, consciously treating boys and girls alike in family and kindergarten will be of little avail in counterbalancing the impressive overall picture of structural sexual asymmetry and alleged differences between the sexes in personality characteristics and behavioral dispositions.
4. The Future of Sex Roles
With modernization, ascriptive categories lose importance. Social systems increasingly come to be differentiated into subsystems, each fulfilling a specific function and operating according to its own code (Luhmann 1998). Gender is the code of the family; it cannot substitute for the codes of other subsystems: in science it is the truth of a statement, on the market the purchasing power, in court the lawfulness of the sentence that counts; the gender of the author, the customer, or the judge is (or should be) irrelevant. True, there still exists inequality between the genders. Nevertheless, all over the world it has been drastically reduced over the past few decades. In all countries, women have profited more from the educational expansion of the 1960s and 1970s; and while at the beginning of the twentieth century voting rights were conceded to women in less than 1 percent of the 133 countries analyzed, today all of those with male franchise (over 90 percent) have extended suffrage to women (Ramirez et al. 1997). Many countries have inserted a clause concerning equal social participation rights for women into their constitutions and set up special institutions; some have also introduced affirmative action. Increasingly, women come into prestigious positions, which will improve the chances of succeeding women, who now meet with role models and old girls' networks. Nevertheless, merely increasing the proportion of women at the top will not suffice. It may only increase the tendency towards a splitting-up of female biographies, with women opting for either family or career (while men can have both). A real change requires a more equal distribution of productive and reproductive work between the genders. In this respect Sweden has been quite successful in institutionalizing an egalitarian welfare regime (e.g., providing publicly financed daycare centers across the whole country, granting a generous leave of absence to both parents at childbirth, and offering a specific paternity leave to fathers, of which 83 percent make use). In Sweden we find a high birthrate (1.8) along with a high rate of female employment (75.5 percent). This stands in contrast to the situation in Germany, where both rates
are low (1.3; 59.7 percent). The German welfare system is described as paternalistic: e.g., women are offered extended publicly financed maternity leaves, yet there are hardly any daycare centers for infants or all-day schools, and the disapproval of working mothers is especially high (Garhammer 1997). Such differences between countries may indicate that social policy does have an important part to play in reducing or reproducing gender inequalities.
See also: Education and Gender: Historical Perspectives; Education (Primary and Secondary Schools) and Gender; Gender and School Learning: Mathematics and Science; Gender Differences in Personality and Social Behavior
Bibliography
Bock G, Duden B 1977 Arbeit aus Liebe—Liebe als Arbeit. Zur Entstehung der Hausarbeit im Kapitalismus. In: Frauen und Wissenschaft. Beiträge zur Berliner Sommeruniversität für Frauen, Juli 1976. Berlin, pp. 118ff
Döbert R, Nunner-Winkler G 1985 Value change and morality. In: Lind G, Hartmann H A, Wakenhut R (eds.) Moral Development and the Social Environment. Precedent Publishing, Inc., Chicago, pp. 125–53
Garhammer M 1997 Familiale und gesellschaftliche Arbeitsteilung—ein europäischer Vergleich. Zeitschrift für Familienforschung 9: 28–70
Gelman S A, Collman P, Maccoby E E 1986 Inferring properties from categories versus inferring categories from properties: The case of gender. Child Development 57: 396–404
Gilligan C, Wiggins G 1988 The origins of morality in early childhood relationships. In: Gilligan C, Ward J V, Taylor J M (eds.) Mapping the Moral Domain: A Contribution of Women's Thinking to Psychological Theory and Education. Harvard University Press, Cambridge, MA, pp. 110–38
Golombok S, Fivush R 1994 Gender Development. Cambridge University Press, New York
Kohlberg L 1966 A cognitive-developmental analysis of children's sex-role concepts and attitudes. In: Maccoby E E (ed.) The Development of Sex Differences. Stanford University Press, Stanford, CA, pp. 82–173
Luhmann N 1998 Die Gesellschaft der Gesellschaft. Suhrkamp, Frankfurt a.M.
Maccoby E E 1990 Gender and relationships: A developmental account. American Psychologist 45: 513–20
Nunner-Winkler G 1994 Der Mythos von den zwei Moralen. Deutsche Zeitschrift für Philosophie 42: 237–54
Parsons T 1964 The Social System. The Free Press of Glencoe, London
Ramirez F, Soysal Y, Shanahan S 1997 The changing logic of political citizenship: Cross-national acquisition of women's suffrage rights, 1890–1990. American Sociological Review 62: 735–45
Slaby R G, Frey K S 1975 Development of gender constancy and selective attention to same-sex models. Child Development 46: 849–56
Tyrell H 1986 Geschlechtliche Differenzierung und Geschlechterklassifikation. Kölner Zeitschrift für Soziologie und Sozialpsychologie 38: 450–89
G. Nunner-Winkler
Sex Segregation at Work

All societies organize work through a sexual division of labor in which women and men typically perform different tasks. These tasks may also depend on people's age, race, ethnicity, nativity, and so forth. Although this sexual division of labor divides market and nonmarket work by sex, sex segregation usually refers to the sexual division of labor among paid workers. Thus, sex segregation is the tendency for the sexes to do different kinds of paid work in different settings. Customarily, 'segregation' denotes physical separation, as in school segregation by race. However, the term 'sex segregation' refers to physical, functional, or nominal differentiation in the work that women and men do. 'Men's work' does not simply differ from 'women's work'; predominantly male jobs tend to be better and more highly valued. This holds both in vertical segregation, which assigns men higher-level jobs than women within the same occupation, and in horizontal segregation, in which the sexes pursue different occupations.
1. Sex Segregation and Sex Inequality

Segregation—whatever its form—is a key engine of inequality. The domains reserved for members of dominant and subordinate groups are not just different, they are unequal, with the more desirable domains reserved for members of dominant groups. Because economic, social, and psychological rewards are distributed through people's jobs, segregation both facilitates and legitimates unequal treatment. Sex segregation reinforces sex stereotypes, thereby justifying the sexual division of labor it entails, and it reduces equal-status cross-sex contacts that could precipitate challenges to differentiation. In sum, workplace segregation benefits men and harms women (Cotter et al. 1997).

1.1 Consequences of Sex Segregation

Job, establishment, and occupational segregation account for most of the differences in the rewards that men and women garner from employment. The most important of these differences is earnings. Estimates of the importance of occupational segregation for the pay gap between the sexes vary widely, depending on the analytic approach. An estimated one-third of the earnings gap results from occupational segregation in the USA and abroad (Anker 1998). Establishment-level data for the USA, Norway, and Sweden analyzed by Petersen and his colleagues attribute about three-quarters of the gap to occupational segregation (Petersen and Morgan 1995). Such estimates may still understate the effect of occupational segregation on the pay gap: the amount of sex segregation in metropolitan labor markets conditions how much segregation affects women's and men's earnings, with all women losing pay and all men gaining in highly segregated markets, and no pay gap remaining if occupational segregation were very low (Cotter et al. 1997, p. 725). Men dominate the more lucrative jobs partly because employers reserve such jobs for men and partly because female work is devalued (England 1992). The undervaluation of female activities not only shrinks women's pay, it also reduces their economic independence, familial power, and sense of entitlement. Sex segregation also creates disparities in the sexes' authority, mobility opportunities, working conditions, and chances to acquire skills. Segregation disproportionately relegates women to dead-end jobs or jobs on short career ladders, thus reducing their aspirations and opportunities for mobility (Baron et al. 1986). This vertical segregation creates a 'glass ceiling' that concentrates women in lower-level positions and reduces their authority.

2. Theories of Segregation
The most common theories of sex segregation emphasize the preferences of workers and employers. Hypothetically, gender-role socialization or a sexual division of domestic labor that encourages men to pick jobs that maximize earnings and women to select jobs that facilitate child-rearing leads to sex-differentiated preferences (England 1992). Men's vested interests also hypothetically prompt them to exclude women from 'men's jobs.' The universality of the assumed sex-specific preferences limits the value of preference explanations: because preferences theoretically vary between, not within, the sexes, the sexes should be completely segregated. Empirical evidence casts doubt on preference theories: youthful occupational preferences are only loosely related to the sex composition of adults' occupations (see Jacobs 1989, 1999). Moreover, the sexes similarly value high pay, autonomy, prestige, and advancement opportunities, which should minimize between-sex differences. Explanations focusing on employers' preferences stemming from sex biases and attempts to minimize employment costs through statistical discrimination are limited by their emphasis on hard-to-measure motives. Although it stands to reason that employers influence job-level segregation because they assign workers to jobs, it is difficult to learn why they assign them as they do. However, by examining the effects of training and turnover costs and skill on women's exclusion and sex segregation, Bielby and Baron (1986) showed that employers discriminated statistically against women by imputing to individuals stereotypical characteristics of their sex; contrary to neoclassical economic theory, this practice was neither efficient nor rational.
In sum, standard theoretical approaches to segregation predict universally high segregation rather than explaining variation in its level. Although these theories have stimulated considerable research, it is difficult to test them directly, so the importance of women's and employers' 'choices' remains a matter of debate. More fruitful explanations (summarized later) try to explain variation in segregation across aggregates or over time.
3. Measuring Segregation

Segregation is a property of aggregates, not individuals. Its measurement depends on how it is conceptualized. The most common conceptualization—the sexes' different representation across work categories—is measured by the index of dissimilarity, D, computed as D = Σ |fi − mi| / 2, where fi and mi represent occupation i's percentage shares of all female and all male workers, respectively. D is the percentage of either female or male workers who would have to change to a work category in which they are underrepresented for the sexes' distributions across categories to be identical. The size of D depends both on the extent of segregation and on the relative sizes of the work entities (e.g., occupations), so it is unsuitable for comparing segregation across populations with work entities of different relative sizes. A size-standardized variant of D, which holds entity size constant, permits comparing levels of segregation across populations. Segregation is also conceptualized as the concentration of the sexes in work entities dominated by one sex. This conception is relevant to the thesis that women are disproportionately crowded into relatively few occupations. A simple measure of concentration is the proportion of each sex in the n occupations that employ the most men or women. (For an index of concentration, see Jacobs (1999), who warns that concentration measures assume that female and male occupations are aggregated to the same degree.) A third conception of segregation is each sex's probability of having a coworker in their job or occupation of the same or the other sex, designated as P* (Jacobs 1999). The value of P* depends on both the extent of segregation and the sex composition of the population. Occupations can be described in terms of their sex composition, and are sometimes characterized as 'integrated,' 'segregated,' or 'female- or male-dominated' based on whether one sex is disproportionately represented. For example, in the USA, billing clerk is a segregated occupation because 89 percent of its incumbents are female, whereas women are just 47 percent of the labor force. The individual-level experience of segregation is the sex composition of her or his job or occupation, so a billing clerk holds a predominantly female occupation. In 1992, Charles proposed using log linear methods to study sex segregation. Because log linear methods
adjust for both the varying sizes of work entities and the sexes' respective shares of the labor force, they are useful for studying why the extent of segregation varies cross-nationally and over time. They also allow researchers to estimate the impact of independent variables on a sex's representation in specific occupations or occupational categories, net of the occupational structure (Charles 1992, p. 488, 1998). All measures of segregation are sensitive to the level of aggregation of the work units of interest. If segregation exists within units—which it usually does—the more aggregated the units, the less segregation will be observed. Although fully disaggregated data, such as establishment-level data, are preferable, they are hard to come by. Cross-national data are particularly likely to be aggregated because researchers need to make work units comparable across countries and because the data some countries collect are broadly aggregated.
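To make these definitions concrete, the sketch below computes D, a size-standardized D, and the same-sex form of P* from a small occupation-by-sex table. It is an illustration only: the occupation names and counts are invented rather than taken from the studies cited in this article, and Python is used purely for convenience.

```python
# Hypothetical counts of (women, men) by occupation -- invented for illustration.
occupations = {
    "billing clerk": (890, 110),
    "lawyer":        (300, 700),
    "schoolteacher": (750, 250),
    "machinist":     (60, 940),
}

total_f = sum(f for f, m in occupations.values())
total_m = sum(m for f, m in occupations.values())

# Index of dissimilarity: D = sum_i |f_i - m_i| / 2, where f_i (m_i) is
# occupation i's percentage share of all female (male) workers.
D = sum(abs(100 * f / total_f - 100 * m / total_m)
        for f, m in occupations.values()) / 2

# Size-standardized D: weight every occupation equally by replacing its raw
# counts with its internal sex shares before applying the same formula.
shares = [(f / (f + m), m / (f + m)) for f, m in occupations.values()]
sf = sum(pf for pf, pm in shares)
sm = sum(pm for pf, pm in shares)
D_std = sum(abs(100 * pf / sf - 100 * pm / sm) for pf, pm in shares) / 2

# P* (same-sex exposure for women): the expected share of women among the
# occupation-mates of a randomly chosen female worker.
P_star = sum((f / total_f) * (f / (f + m)) for f, m in occupations.values())

print(f"D = {D:.1f}, size-standardized D = {D_std:.1f}, female P* = {P_star:.2f}")
```

Run on these invented counts, the sketch yields D = 64, meaning that 64 percent of either sex would have to change occupations for the two distributions to match; collapsing the four occupations into broader categories would reduce the measured segregation, which is the sensitivity to aggregation noted above.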
4. The Extent of Segregation

Most research on segregation focuses on the sexes' concentration in different occupations. (An occupation comprises jobs that involve similar activities within and across establishments.) In a 'typical' country in the early 1990s about 55 percent of workers held either 'male' or 'female' jobs (Anker 1998, p. 5). Although countries vary considerably in the extent to which the sexes are segregated occupationally, in the 1990s most developed nations had segregation indices between 50 and 70 (Anker 1998). In 1997, the segregation index for detailed occupations in the USA was 53.9 (Jacobs 1999). The sexes are most segregated in the Middle East and Africa. In the early 1990s, segregation indices across detailed occupations were at least 70 for Tunisia, Kuwait, and Jordan, largely because of Muslim proscriptions against contact between the sexes. Most of the EU societies and 'transition economies' (former Soviet satellites) had indices in the mid-50s. Segregation was lowest in Asian/Pacific nations, with indices ranging from 36 (China) to 50 (Japan). China's low segregation level results from its large, sex-integrated agricultural sector. Among the Western developed countries, Italian and US workers were least segregated, and Scandinavians most segregated. In these countries, occupational segregation is positively related to women's labor force participation; indeed the same factors facilitate women's labor force participation and encourage sex segregation—paid parental leave and part-time work, which lead to statistical discrimination against women, and the shift of customarily female domestic tasks into the market economy (Charles 1992). Also contributing to cross-national differences in segregation are the relative sizes of the manufacturing and service sectors, since men tend to dominate the
former and women the latter (Charles 1998). Variation in national commitment to gender ideology also explains cross-national differences in segregation (Charles 1992).

4.1 Segregation Across Jobs

Much higher levels of sex segregation appear in establishment-level studies of job segregation (a job is a position in an establishment whose incumbents perform particular tasks) because people in the same occupation work in different establishments or different jobs (Bielby and Baron 1984; Tomaskovic-Devey 1993). For example, male lawyers are more likely than female lawyers to work for law firms, and within law firms men dominate litigation. Organizations' characteristics and personnel practices affect how segregated they are. Small and large establishments are the most segregated. Small establishments are segregated because they employ workers of only one sex, whereas large bureaucracies have segregative personnel practices, such as sex-segregated job ladders (Bielby and Baron 1984).
5. Trends in Segregation

In the last quarter of the twentieth century segregation fell in most of the world, although women continued to dominate clerical and service occupations, and most customarily male production jobs remained inaccessible to women (Charles 1998). Segregation increased during the 1970s and 1980s in Asian/Pacific countries; remained constant in Southern and Central European countries; and declined in the USA, Western Europe, the Middle East and North Africa, and other developing countries (Anker 1998, p. 328). In most countries declines reflected increased occupational integration as well as shifts in the occupational structure in which heavily segregated occupations shrank. In the USA, increased integration resulted primarily from women's movement into customarily male occupations. The higher pay in customarily male jobs draws women. Large numbers of men enter customarily female occupations only when those occupations become much more attractive, and such changes are rare. The size-standardized index was stable between 1990 and 1997, indicating that after 1990 real occupational integration halted, although fewer persons worked in heavily segregated occupations (Jacobs 1999).
5.1 Explaining the Decline in Segregation

According to the theoretical framework summarized above, the more similar the sexes become in their preferences and skills, the less segregated they will be. According to cross-national research, the narrowing of the education gap between the sexes has contributed to integration. In the USA, for example, as fewer women majored in education and more majored in business, educational segregation declined and occupations integrated. The shrinking experience gap between the sexes may also reduce segregation if it relieves employers' misgivings over hiring women for jobs that involve firm-specific skills. However, employers' hiring and job-assignment practices are the proximate determinants of the level of segregation. Many establishments employ women in customarily male jobs only when there are not enough qualified men. Thus, organizational and occupational growth both foster integration if they cause a shortfall of male labor. Shortages have also brought about integration when occupations deteriorated in their pay, autonomy, or working conditions (e.g., pharmacist, typesetter; Reskin and Roos 1990) and thus became less attractive to men. Employers who face economic penalties for segregating and whose job assignments are monitored are more likely than others to integrate jobs (Baron et al. 1991). Thus, anti-discrimination and affirmative action regulations have fostered integration by making it illegal for employers to exclude workers from jobs on the basis of their sex. These rules have made a difference largely through 'class-action' lawsuits against employers. Anti-discrimination laws are ineffective in reducing segregation when regulations are not enforced and when class-action lawsuits are not permitted. Finally, according to development theory, modernization replaces ascription with achievement in filling social positions. However, modernization is associated with the growth of economic sectors that most societies label as 'women's work.' Thus, the net effect of modernization, as measured by GNP, has been segregative because it expands female-dominated clerical and sales occupations (Charles 1992).

5.2 Nominal Integration

What appears to be genuine integration may be a stage in a process of re-segregation in which women are replacing men as the predominant sex. In the USA, for example, insurance adjusting and typesetting re-segregated over a 20-year period when these occupations changed from male- to female-dominated. For individual women, the corresponding process is the 'revolving door' through which women in male occupations pass back into customarily female occupations (Jacobs 1989).
5.3 Explaining Stability

Because the amount of sex segregation has changed so little, scholars have devoted as much attention to explaining stability as to explaining change. Unchanging levels of segregation may result from stability in
the causal variables or from processes whose opposing effects cancel each other out. Gender ideology strongly favors stability. Many occupations are sex-labeled, representing a cultural consensus that they are appropriate for one sex but not the other (e.g., childcare worker, automobile mechanic). Although these labels are not deterministic (indeed they sometimes vary across societies), they predispose employers to prefer sex-typical workers and workers to prefer sex-typical jobs. Although state policies have altered gender ideology, it is usually a conservative force. Work organizations also resist change in customary business practices. Organizations' structures and practices reflect the cultural environment present when they were founded, and inertia preserves these structures and practices. Such practices include sex-segregated promotion ladders, credentials that are required for jobs, and the use of workers' informal networks to fill jobs. For example, recruitment through networks perpetuates the sex composition of jobs because people's acquaintances tend to be of the same sex. Statistical discrimination and stereotype-based job assignments perpetuate the sex composition of jobs because they prevent employers from discovering that their stereotypes are unfounded. Thus, barring external pressures to alter job assignments, job segregation is unlikely to change. Of course, stable levels of segregation may be misleading if they result from counteracting forces that have opposing effects on the level of sex segregation. For example, in the 1980s Japan's occupations became slightly more integrated, but this was offset by the increased number of workers employed in occupations that were either male or female dominated. The growth of female-dominated clerical and service sectors in most economically advanced countries will likewise cancel out some of the shift toward greater integration within occupations.

See also: Affirmative Action: Comparative Policies and Controversies; Affirmative Action: Empirical Work on its Effectiveness; Affirmative Action Programs (India): Cultural Concerns; Affirmative Action Programs (United States): Cultural Concerns; Affirmative Action, Sociology of; Discrimination; Education (Higher) and Gender; Equality of Opportunity; Feminist Legal Theory; Gender and School Learning: Mathematics and Science; Gender and the Law; Modernization, Sociological Theories of; Sex Differences in Pay
Bibliography

Anker R 1998 Gender and Jobs: Sex Segregation of Occupations in the World. International Labour Office, Geneva, Switzerland
Baron J N, Davis-Blake A, Bielby W T 1986 The structure of opportunity: How promotion ladders vary within and among
organizations. Administrative Science Quarterly 31: 248–73
Baron J N, Mittman B S, Newman A E 1991 Targets of opportunity: Organizational and environmental determinants of gender integration within the California Civil Service 1979–1985. American Journal of Sociology 96: 1362–1401
Bielby W T, Baron J N 1984 A woman's place is with other women. In: Reskin B F (ed.) Sex Segregation in the Workplace: Trends, Explanations, Remedies. National Academy Press, Washington, DC
Bielby W T, Baron J N 1986 Men and women at work: Sex segregation and statistical discrimination. American Journal of Sociology 91: 759–99
Charles M 1992 Cross-national variation in occupational sex segregation. American Sociological Review 57: 483–502
Charles M 1998 Structure, culture, and sex segregation in Europe. Research in Social Stratification and Mobility 16: 89–116
Cotter D A, DeFiore J A, Hermsen J M, Kowalewski B, Vanneman R 1997 All women benefit: The macro-level effects of occupational integration on earnings inequality. American Sociological Review 62: 714–34
England P 1992 Comparable Worth: Theories and Evidence. Aldine de Gruyter, New York
Gross E 1968 Plus ça change…? The sexual structure of occupations over time. Social Problems 16: 198–208
Hakim C 1992 Explaining trends in occupational segregation: The measurement, causes, and consequences of the sexual division of labor. European Sociological Review 8: 127–52
Jacobs J A 1989 Revolving Doors. Stanford University Press, Stanford, CA
Jacobs J A 1999 The sex segregation of occupations: Prospects for the 21st century. Forthcoming in: Powell G A (ed.) Handbook of Gender in Organizations. Sage, Newbury Park, CA
Petersen T, Morgan L A 1995 Separate and unequal: Occupation-establishment sex segregation and the gender wage gap. American Journal of Sociology 101: 329–65
Reskin B F 1993 Sex segregation in the workplace. Annual Review of Sociology 19: 241–70
Reskin B F, Roos P A 1990 Job Queues, Gender Queues: Explaining Women's Inroads Into Customarily Male Occupations. Temple University Press, Philadelphia, PA
Tomaskovic-Devey D 1993 Gender and Racial Inequality at Work. Cornell University, Ithaca, NY
B. F. Reskin
Sex Therapy, Clinical Psychology of

Sex therapy is an approach to the treatment of sexual problems. Results of psychophysiological studies of sexual responses in sexually functional subjects, in the 1960s, allowed gynecologist William Masters and psychologist Virginia Johnson to develop a therapeutic format for the treatment of sexual inadequacy. Apart from organic causes, sexual problems may be attributed to interpersonal difficulties and to problems with emotional functioning. Masters and Johnson were convinced that 'adequate' stimulation would result in sexual response in an engaged partner. Adequate stimulation feels good, is rewarding, and facilitates focus of attention on those feelings.
Originally, adequate stimulation in therapies was restricted to behaviors comprising the classical heterosexual foreplay, coitus, and orgasm sequence. This criterion for adequate stimulation has been abandoned because of the rich diversity of sexual practices that exist in the real world. Masters and Johnson hypothesized that fear of failure, and as a consequence becoming a spectator of one's own feelings, is the most important cause of sexual inadequacy. They devised a very ingenious procedure to assist people in becoming engaged in feelings of sexual excitement. Three important steps were specified. In the first step people learn to accept that mutual touching feels good. In this step demands for sexual performance, and anxieties related to such demands, are precluded by an instruction not to touch genitals or secondary sex characteristics (e.g., breasts). When the first step leads to acceptance of positive bodily feelings, the second step demands including genitals and breasts in mutual touching and caressing. Masters and Johnson have suggested some variations to accommodate specific sexual problems (e.g., premature ejaculation, erectile problems). The second step should culminate in the physiological signs of sexual excitement—vaginal lubrication in women and erection of the penis in men. In the classical sequence the first and second steps prepared for the third step, which consisted of coital positions, again with variations, and stimulation through coitus to orgasm. Although coitus may be relevant from a reproductive point of view, it is certainly not the only way to experience the ultimate pleasures of sex.
1. Desire, Excitement, and Orgasm

Masters and Johnson, on the basis of their psychophysiological studies, proposed a model of sexual response consisting of three phases: an excitement phase, an orgasm phase, and a resolution phase. The excitement phase and orgasm phase may be recognized as the second and third steps of the classic treatment format. When other therapists began to apply Masters and Johnson's format it became clear that many patients do not easily engage in the interactions prescribed by the sex therapy format. Apparently, an initial motivational step was missing. People who hesitate or avoid intimate interactions may lack desire, which often means that they do not expect the interaction to be rewarding. In 1979 Helen Kaplan proposed adding a desire phase, preceding the three phases specified by Masters and Johnson. Since Kaplan's proposal it has become clear that the prevalence of lack of desire is considerable. With hindsight most people accept that lack of desire must be the most important sexual problem. It is a problem for the individual who does not arouse desire in his or her partner. In most instances, not feeling desire is in itself unproblematic, but lack of sexual desire may become a problem in the relationship.
Sex therapy was a fresh, new treatment. Sexual problems were openly discussed, there was no time-consuming delving into past conflicts, and there were suggestions for a direct reversal of symptoms of sexual failure. Masters and Johnson preferred working with couples because the interaction within the couple often contributes in important ways to the sexual difficulties. Other therapists have offered treatment to individuals and to groups of individuals or couples. An alternative to Masters and Johnson's therapy format is the mimicking of normal sexual development through the use of masturbation. This has been an important step for many women, especially those who missed this aspect of the discovery of their own sexuality. In this approach people learn to induce sexual excitement through masturbation, to eventually apply this 'skill' in interaction with a partner. Some approaches developed from behavior therapy and rational emotive therapy focused on performance anxiety and fear of failure. Others used interventions from couple and group therapy. It is fair to say that nowadays almost all approaches to sexual difficulties incorporate elements from the Masters and Johnson sex therapy format.
2. Sexual Dysfunctions in Men

2.1 Diagnostic Procedures

The aim of the initial clinical interview is to gather detailed information concerning current sexual functioning, the onset of the sexual complaint, and the context in which the difficulty occurred. This information gathering may be aided by the use of a structured interview and paper-and-pencil measures regarding sexual history and functioning. An individual and conjoint partner interview, if possible, can provide additional relationship information and can corroborate data provided by the patient. The initial clinical interview should help the clinician in formulating the problem. It is important to seek the patient's agreement with the therapist's formulation of the problem. When such a formulation is agreed upon, the problem may guide further diagnostic procedures. Many men with erectile dysfunction may be wary of psychological causes of their problem. Psychological causes seem to imply that the man himself is responsible for his problem. This may add to the threat to his male identity that he is already experiencing by not being able to function sexually. Considering the way a man may experience his problem, it can be expected that it will not be easy to explain to him the contribution of psychological factors. A clinician knowledgeable in biopsychosocial aspects of sexual functioning should be able to discuss the problem openly with the patient. Dysfunctional performance is meaningful performance in the sense that misinformation, emotional states, and obsessive concerns about performance provide information about the
patient's 'theory' of sexual functioning. When contrasting this information with what is known about variations in adequate sexual functioning, it is often clear that one cannot but predict that the patient must fail. For the clinician a problem arises when, even with adequate stimulation and adequate processing of stimulus information according to the clinician's judgment, no response results, either at a physiological or a psychological level. At this point, a number of assessment methods aimed at identifying different components or mechanisms of sexual functioning may be considered. In principle, two main strategies may be followed. In the first, although a psychological factor interfering with response cannot be inferred from the report of the patient, one can still suspect some psychological factor at work. Possibly the patient is not aware of this factor and thus cannot report on it. Eliminating this psychological influence may result in adequate response. The second strategy applies when, even with adequate (psychological) stimulation and processing, responding is prevented by physiological dysfunction. Physiological assessment may then aid in arriving at a diagnostic conclusion. The biopsychosocial approach predicts that it is inadequate to choose one of these strategies exclusively. The fact that sexual functioning is always psychophysiological functioning means that there may always be an unforeseen psychological or biological factor.
2.2 Psychological Treatments of Sexual Dysfunctions in Men

2.2.1 Approaches to treatment. The most important transformation of the treatment of sexual dysfunctions occurred after the publication of Masters and Johnson's (1970) Human Sexual Inadequacy. First of all, they brought sex into the treatment of sexual problems. Before the publication of their seminal book, sexual problems were conceived as consequences of (nonsexual) psychological conflicts, immaturity, and relational conflicts. In most therapies for sexual problems sex was not a topic in the therapeutic transactions. There were always things 'underlying,' 'behind,' and 'besides' the sexual symptoms that deserved discussion. Masters and Johnson proposed to attempt directly to reverse the sexual dysfunction by a kind of graded practice and focus on sexual feelings. If sexual arousal depends directly on sexual stimulation, that very stimulation should be the topic of discussion. Here the second important transformation occurred: a sexual dysfunction was no longer something pertaining to an individual; rather, it was regarded as a dysfunction of the couple. It was assumed the couple did not communicate in a way that allowed sexual arousal to occur when they intended to 'produce' it. Masters and Johnson thus
initially considered the couple as the 'problem' unit. Treatment goals were associated with the couple concept: the treatment goal was orgasm through coital stimulation. This connection between treatment format and goals was lost once Masters and Johnson's concept was used in common therapeutic practice. People came in for treatment as individuals. Male orgasm through coitus adequately fulfills reproductive goals, but it is not very satisfactory for many women because they do not easily achieve orgasm through coitus. What has remained over the years, since 1970, is a direct focus on dysfunctional sex and a focus on sexual sensations and feelings as a vehicle for reversal of the dysfunction. What Masters and Johnson tried to achieve in their treatment model is a shift in their patients' focus of attention. Let us look at one of Masters and Johnson's interventions to elucidate this point. People with sexual dysfunctions tend to wait and look for the occurrence of feelings, instead of feeling what occurs—hence, the spectator role. Their attention is directed towards something that is not there or does not exist, which is frustrating. In its simplest form, Masters and Johnson's proposal redirects attention in the following way: first of all they manipulate expectations by instructing the patient about what is allowed to occur and what is not. It is explained to the patient that nonsexual feelings are to be accepted as a way to accept sexual feelings later on, and therefore sexual areas are excluded in the initial homework tasks. From a psychological point of view this manipulation is ingenious; it directs attention away from sex—when you feel a caress on your arm it may be pleasant but (now) it is not sexual—while, at the same time, it defines sexual feelings as feelings in 'sexual areas.' To attain a direct approach to sexual function numerous variants of couple, communication, and group therapy have been used. Rational emotive therapy has been used to change expectations and emotions (see Behavior Psychotherapy: Rational and Emotive). To remedy biographical memories connected with sexual dysfunction, psychoanalytic approaches have been used, as well as cognitive behavior therapy approaches (see Behavior Therapy: Psychological Perspectives). There are specific interventions for some dysfunctions; for example, premature ejaculation has been treated with attempts at heightening the threshold for ejaculatory release (stop–start or squeeze techniques). Recently, as a spin-off of research into cardiac vascular smooth muscle pharmacology, drugs have become available which act by relaxing smooth muscles in the spongiose and cavernous structures of the penis. This relaxation is necessary to allow blood flow into the penis, thus causing an erection. Some of these drugs (e.g., sildenafil—Viagra®) support the natural neurophysiological reaction to sexual stimuli. Others act locally in the penis without sexual stimulation.
To slow down the speed of ejaculation in men with premature ejaculation, SSRIs (selective serotonin reuptake inhibitors) seem to be helpful. Smooth muscle relaxants like sildenafil are helpful in older men, in whom fewer naturally occurring transmitters are available. Men with vascular or neurodegenerative diseases (e.g., diabetes, multiple sclerosis) may also benefit from the use of smooth muscle relaxants. Although these drugs are very effective, they do not help every man with erectile problems. Pharmacological treatment for erectile disorder may be an important step in restoring sexual function. Most couples will need information and advice to understand what they may expect from this treatment. For many it will not bring the final resolution of their relationship problems. In addition to drug treatment they will need some form of sex therapy or psychotherapy.

2.2.2 Validated treatments for male sexual dysfunctions. It has long been difficult to get an overview of treatments for sexual dysfunctions because any proposal about how to approach dysfunctions was taken as valid. This has changed with the introduction of criteria for validated or evidence-based practice by the American Psychological Association (APA). From the timely review by O'Donohue et al. (1999) of psychotherapies for male sexual dysfunctions it appears that the state of the art is far from satisfactory. Following the criteria of the APA's Task Force, they found no controlled outcome studies for male orgasmic disorder, sexual aversion disorder, hypoactive sexual desire disorder, and dyspareunia in men. For premature ejaculation and for erectile disorder there is evidence for the usefulness of psychological treatment, but effects are limited and often unstable over time. Although the evidence-based practice movement should be firmly supported, unqualified support would be disastrous for the practice of the treatment of sexual problems. The care for patients with sexual problems must be continued even without proof according to the rules of 'good clinical practice.' The sensible clinician will learn to be very careful about any claims concerning either diagnostic procedures or treatments.
3. Sexual Dysfunctions in Women

3.1 Diagnostic Methods

Similar to the procedures in men, initial interviews should help the clinician in formulating the problem and in deciding whether sex therapy is indicated. Since sexual problems can be a consequence of sexual trauma it is necessary to ask if the woman ever experienced sexual abuse. An important issue is the agreement between therapist and patient about
the formulation of the problem and the nature of the treatment. To reach a decision to accept treatment, the patient needs to be properly informed about what the diagnosis and the treatment involve. Depending on the nature of the complaint, the initial interviews may be followed by medical assessments. In contrast to the assessment of men, the psychobiological assessment of women's sexual problems is not well developed.
3.2 Psychological Treatments of Sexual Dysfunctions in Women

3.2.1 Approaches to treatment. As for men, the treatment of sexual dysfunction in women contains many elements from Masters and Johnson's sex therapy. As noted before, an important addition, especially in women, is the use of masturbation to discover their own sexuality. Low sexual desire is generally treated with sensate focus exercises, to minimize performance pressure, and communication training. In the treatment of sexual aversion the focus is on decreasing anxiety, the common core of sexual aversions. Behavioral techniques, like exposure, are most commonly used. Treatment of sexual arousal disorder generally consists of sensate focus exercises and masturbation training, with the emphasis on becoming more self-focused and assertive in asking for adequate stimulation. For the treatment of primary (lifelong) anorgasmia there exists a well-described treatment protocol. Basic elements of this program are education, self-exploration and body awareness, and directed masturbation. Because of the broad range of problems behind the diagnosis of secondary (not lifelong) and situational anorgasmia, there is no single major treatment strategy for this sexual disorder. Depending on the problem, education, disinhibition strategies, and assertiveness training are used. It is important to identify unrealistic goals for treatment, like achieving orgasm during intercourse without clitoral stimulation. For dyspareunia (genital pain often associated with intercourse), there are multiple possible somatic and psychological causes. A common picture is vulvar vestibulitis, pain at small inflamed spots at the lower side of the vaginal opening. However, often there is no clear organic cause for the pain. Treatment should be tuned to the specific causes diagnosed and can vary from patient to patient. Behavioral interventions typically include prohibition of intercourse and finger exploration of the vagina, first by the woman, then by her partner. Sensate focus exercises may be used to increase sexual arousal and sexual satisfaction. Pelvic floor muscle exercises and relaxation training can be recommended in case of vaginismus or a high level of muscle tension in the pelvic floor.
Treatment of vaginismus commonly involves exposure to vaginal penetration by using dilators of increasing size or the woman's fingers. Pelvic floor muscle exercises may be used to provide training in discrimination of vaginal muscle contraction and relaxation, and to teach voluntary control over muscle spasm. Pharmacological treatment of sexual disorders of women is just beginning. Smooth muscle relaxants have been used in women to ameliorate sexual arousal and, as a consequence, hypoactive sexual desire. It appears that drugs like sildenafil produce smooth muscle relaxation and increased genital blood flow, but they have no effect on the subjective experience of sexual response. In women with subphysiological levels of testosterone—mainly in the postmenopause—testosterone patches appear to have an effect on mood, energy, and libido.

3.2.2 Validated treatments for women's sexual dysfunctions. Reviews of treatments for sexual dysfunctions in women following the criteria for validated or evidence-based practice have been published (O'Donohue et al. 1997, Heiman and Meston 1997, Baucom et al. 1998). Heiman and Meston conclude that treatments for primary anorgasmia fulfil the criteria of 'well-established,' and secondary anorgasmia studies fall into the 'probably efficacious' group. They conclude with some reservations that vaginismus appears to be successfully treated if repeated practice with vaginal dilators is included in the treatment. Their reservations are due to a lack of controlled or treatment comparison studies of vaginismus. All authors conclude that adequate data on the treatment of sexual desire disorder, sexual arousal disorder, and dyspareunia are lacking. Although the evidence-based practice movement deserves support, care for patients with sexual problems must be continued even without proof according to the rules of 'good clinical practice.'
4. Future Directions

Sex therapy bloomed in the 1970s and 1980s, but reviews of evidence-based treatments suggest that developments have stagnated and very few new studies have been undertaken. The recent shift to biological approaches will continue, at least for a while. Viagra and testosterone patches will shortly be followed by more centrally acting drugs (e.g., dopamine agonists). The search for drugs has provoked a wide range of studies into the biological basis of sexual function. This work inspires behavioral and cognitive neuroscience studies, which may provide a framework and new tools to better understand sexual emotions and sexual motivation.

See also: Psychological Treatments, Empirically Supported; Sexuality and Gender
Bibliography

Baucom D H, Shoham V, Mueser K T, Daiuto A D, Stickle T R 1998 Empirically supported couple and family interventions for marital distress and adult mental health problems. Journal of Consulting and Clinical Psychology 66: 53–88
Heiman J R, Meston C M 1997 Evaluating sexual dysfunction in women. Clinical Obstetrics and Gynecology 40: 616–29
Janssen E, Everaerd W 1993 Determinants of male sexual arousal. Annual Review of Sex Research 4: 211–45
Kaplan H S 1995 The Sexual Desire Disorders: Dysfunctional Regulation of Sexual Motivation. Brunner/Mazel, New York
Kolodny R C, Masters W H, Johnson V E 1979 Textbook of Sexual Medicine, 1st edn. Little, Brown, Boston
Laan E, Everaerd W 1995 Determinants of female sexual arousal: Psychophysiological theory and data. Annual Review of Sex Research 6: 32–76
Laumann E O, Paik A, Rosen R C 1999 Sexual dysfunction in the United States: Prevalence and predictors. Journal of the American Medical Association 281: 537–44
Masters W H, Johnson V E 1970 Human Sexual Inadequacy, 1st edn. Little, Brown, Boston
O'Donohue W T, Dopke C A, Swingen D N 1997 Psychotherapy for female sexual dysfunction: A review. Clinical Psychology Review 17: 537–66
O'Donohue W T, Geer J H (eds.) 1993 Handbook of Sexual Dysfunctions: Assessment and Treatment. Allyn and Bacon, Boston
O'Donohue W T, Swingen D N, Dopke C A, Regev L V 1999 Psychotherapy for male sexual dysfunction: A review. Clinical Psychology Review 19: 591–630
W. Everaerd
Sexual Attitudes and Behavior The focus in this article is on changes in premarital, homosexual, and extramarital sexual attitudes and behaviors as revealed in nationally representative surveys during the second half of the twentieth century. Attention will be mainly on shared sexual attitudes and behaviors in various societies. A shared attitude is a cultural orientation that pressures us to have positive or negative feelings toward some behavior, and to think about that behavior in a particular way.
1. Sexual Attitudes: Premarital Sexuality

Findings from various countries will be examined and compared, but because of the large number of national surveys conducted in the USA, we will start there. The first national representative sample of adults in the USA that used a series of scientifically designed questions to measure premarital sexual attitudes was completed in 1963 (Reiss 1967). The National Opinion Research Center (NORC) at the University of Chicago was contracted to do the survey. Reiss composed 24 questions about premarital sexual relationships that
formed two unidimensional scales and several subscales (Reiss 1967, Chap. 2). The highest acceptance was to the question asking about premarital coitus for a man when engaged. Even on this question only 20 percent (30 percent of males and 10 percent of females) agreed that such premarital coitus was acceptable (Reiss 1967, p. 31). The date of this survey (early 1963) was strategic because it was at the start of the rapid increase in premarital sexuality that came to be known as the sexual revolution, and which soon swept the USA and much of the Western world. Two years later, in 1965, NORC fielded a national survey that contained four questions on premarital sexual attitudes. One of the questions also asked about the acceptance of premarital coitus for males when engaged. Scott (1998) reports an average acceptance rate of 28 percent, consisting of 37 percent of males and 21 percent of females in this survey. Acceptance was designated by a person checking either of the last two categories ('wrong only sometimes' and 'not wrong at all'), and disapproval was indicated by a person checking either of the first two categories ('always wrong' and 'almost always wrong'). The 20 percent acceptance rate in the 1963 national survey and the 28 percent rate in this 1965 survey lend credence to a rough estimate that at the start of the sexual revolution in premarital sex (1963–5), about one quarter of adults in the USA accepted premarital coitus while three quarters disapproved of it. In 1970 Albert Klassen and his colleagues at the Kinsey Institute conducted the next nationally representative survey (Klassen et al. 1989). Klassen also used NORC to conduct his survey. He used four questions and asked about the acceptability of premarital intercourse for males and females, when in love and when not in love. Reiss reported that the responses to questions about males specifying love and those specifying engagement were less than 2 percent apart, and so the Klassen question about premarital sex for males in love can be compared to the two previous surveys. Klassen reports that 52 percent (60 percent of males and 45 percent of females) chose the two acceptant categories of 'wrong only sometimes' or 'not wrong at all' (Klassen et al. 1989, p. 389). In just seven years acceptance of coitus rose from 20 percent in 1963 and 28 percent in 1965 to 52 percent in 1970. The questions used in all three national surveys seem comparable and, most importantly, the size of the difference from 1963 to 1970 is so large that it is hard not to conclude that in those seven years, something that can be called a revolution began to evidence itself in American attitudes toward premarital coitus. Reiss considered the increased autonomy of females and young people as the key factor in the increase in premarital sexual permissiveness and developed the Autonomy Theory explanation of the sexual revolution around this concept (Reiss 1967, Chap. 10, Reiss and Miller 1979). In the 1960s a higher percentage of females were employed than ever before, and this
meant more autonomy for them, and more autonomy for their children from parental controls and indoctrination. Females more than males were impacted by this increased autonomy, and the three studies show much greater proportionate changes in female attitudes than in male attitudes. Adult males' acceptance doubled from 30 percent in 1963 to 60 percent in 1970, whereas adult females' acceptance more than quadrupled, from 10 percent to 45 percent, during the same period. The autonomy theory predicted that since female autonomy was increasing the most, female premarital permissiveness would increase the most. This is precisely what happened. Starting in 1972 NORC introduced the General Social Survey (GSS) to gather national data annually or biennially on adults in the USA concerning premarital sexuality and a wide range of other nonsexual attitudes and behaviors. These data afford a basis here to examine the change in premarital sexual attitudes from 1972 to 1998. Unfortunately, the GSS researchers did not ask a question modeled after that in the 1963, 1965, and 1970 surveys, which all specified gender and the presence of love or engagement. The GSS question basically taps a respondent's global response to the acceptability of premarital coitus. It asked: 'If a man and a woman have sex relations before marriage, do you think it is always wrong, almost always wrong, wrong only sometimes or not wrong at all' (Davis and Smith 1999, p. 235). The 'always wrong' response to the GSS question is the only response that clearly excludes acceptance of coitus even for an engaged or in-love male, and so it will be considered the response indicating rejection of such behavior. The 1972 GSS survey reported that 63 percent accepted premarital coitus under some condition and 37 percent checked the 'always wrong' category. That was a significant increase from the 52 percent acceptance reported by Klassen for 1970. By 1975 the GSS surveys reported that acceptance had risen to 69 percent. The rise was small after 1975, and acceptance was 74 percent in 1998 (Davis and Smith 1999, p. 235). Looking at all the national surveys in the USA from 1963 to 1998, the evidence is that the period of most rapid change in acceptance of premarital coitus was from 1963 to 1975, with an overall increase from 20 percent to 69 percent. That is the period that we can most accurately label as a premarital sexual revolution. In all the surveys discussed, the data showed that females, much more than males, increased their acceptance of premarital coitus, and this led to more gender equality in attitudes towards premarital coitus. The gender comparisons in 1963 were 10 percent female acceptance to 30 percent male acceptance. By 1975 the comparisons were 65 percent female acceptance to 74 percent male acceptance. Studies in a number of European societies indicate that even after the increased acceptance of premarital coitus in the USA, many Western countries were still more acceptant than the USA. For example, using
data from the International Social Survey Program (ISSP) of 1994, comparing the USA to five other societies, Scott reports that only Ireland was less acceptant of premarital coitus than the USA. Germany and Sweden were much more acceptant, and even Britain and Poland were significantly more acceptant (Scott 1998, p. 833). Unlike in the USA, Scott reports increases in acceptance of premarital coitus continuing in Britain during the 1980s and 1990s. There are other national surveys that can be studied, such as the 1971 and 1992 national surveys in Finland, that show similar trends to what was found in the USA (Kontula and Haavio-Mannila 1995, Chap. 12). Frequent mention of comparable changes in premarital sexual attitudes can be found in the International Encyclopedia of Sexuality in its accounts of 31 societies (Francoeur 1997). There are also a number of other European countries with national surveys taken in the 1980s and 1990s but lacking earlier national surveys for comparison. Nevertheless, what evidence we have on these other societies seems to support a significant increase in the acceptance of premarital coitus similar to what was happening in the USA, although not necessarily in the exact same years.
2. Sexual Attitudes: Homosexuality

In 1973, a question on homosexual behavior was first asked in the GSS national survey in the USA. No distinction was made between male and female homosexuality. Nineteen percent accepted homosexual behavior as 'wrong only sometimes' or 'not wrong at all.' That did not vary a great deal until the 1993 GSS survey, where acceptance jumped to over 29 percent. It then rose to 34 percent in 1996 and to 36 percent in 1998 (Davis and Smith 1999). One can only speculate as to why in the early 1990s this change accelerated in the USA. Perhaps the changes in the 1980s toward greater civil rights for homosexuals encouraged the increase in acceptance of homosexual behavior itself. However, changes in the intensity of feelings cannot be indicated by the simple percent distribution on the GSS question on homosexuality. Clearly, using just one question to measure a sexual attitude, while useful, does not afford us sufficient information. There are national data on homosexuality from other countries. Inglehart, using his two World Values Surveys, compared 20 societies in 1981 and 1990 on the question of homosexuality. He reported that in all but three of the 20 (Ireland, Japan, South Africa) there was an increase in acceptance of homosexuality between 1981 and 1990 (Inglehart 1997, p. 279). In addition, the ISSP (1994) reported that Poland and Ireland were less acceptant of homosexuality than the USA, whereas Britain, West Germany, East Germany, and Sweden were more acceptant (Scott 1998, p. 833). When we compare changes in males and females in
the USA using the GSS data, we find that females changed more than males in accepting homosexuality. Homosexual attitudes have traditionally been one of the very few sexuality areas where females equal or exceed males in the acceptance of a sexual behavior. This type of male/female difference was also commonly found in the Western European countries studied in the World Values Surveys. But this higher female level of acceptance of homosexuality was not typically found in Eastern European or Asian countries (Inglehart 1997).
3. Sexual Attitudes: Extramarital Sexuality

Extramarital sexual attitudes present a very different trend from either premarital or homosexual attitudes. The GSS data for the USA show that the acceptance of extramarital sexuality fell significantly between 1973 and 1988. Acceptance of extramarital sexuality (answering 'wrong only sometimes' and 'not wrong at all') in 1973 was 16 percent, but by 1988 this had dropped to only 8 percent. Although male acceptance stayed higher than that of females, both genders showed close to a 50 percent decrease in acceptance over that period. The fear of HIV/AIDS may have played a role. Negative experiences with extramarital sexuality during the era of rapidly increasing divorce rates in the 1970s may also have contributed to this conservative trend. This more conservative shift in extramarital attitudes was seen in only a minority of the 20 countries on which Inglehart presents data for the period 1981–90 (Inglehart 1997, p. 367). France, Northern Ireland, Sweden, Argentina, and South Africa showed the sharpest decreases in their acceptance of extramarital sexuality. Meanwhile, Mexico, Italy, Finland, and Hungary evidenced the strongest changes toward greater acceptance of extramarital sex. This finding presents a most interesting puzzle as to why some countries changed to be more restrictive while others became more acceptant of extramarital sexuality. Adding more to this puzzle is the finding by Inglehart that from 1981 to 1990, 16 (of 19) countries increased their belief that a child needs two parents to be happy (1997, p. 287). Thus, there was clearly more agreement in most of these countries on the importance of the two-parent family than on extramarital sexuality. It must be that in some countries, extramarital sexuality was not seen as a challenge to the stability of the two-parent family. There is a need here also to study the various types of extramarital coital relationships to distinguish the impact on stable relationships of having a casual vs. a love affair and/or a consensual vs. a nonconsensual affair (Reiss 1986, Chap. 3). These sexual complexities are but one of the many sexual conundrums waiting to be deciphered by more detailed research that can clarify and elaborate the survey research data presented here.
4. Relation of Sexual Behaviors and Sexual Attitudes

The relation of sexual behavior to the attitude changes we have noted can be explored in several national studies in the USA. The 1982 National Survey of Family Growth (NSFG) interviewed females 15 to 44 years old. These females were asked about their first coitus, and from this we can obtain retrospective reports for those in this sample who were teenagers in the early 1960s. Hopkins reports that the average percent nonvirginal for 15–19-year-old females in 1962 was 13 percent, and this rose to 30 percent by 1971 (Hopkins 1997). Zelnik and Kantner (1980) undertook three nationally representative samples of teenage females in the USA during the 1970s. They report that the percentage of 15–19-year-old females who had experienced coitus rose from 30 percent in 1971 to 43 percent in 1976, and finally to 50 percent in 1979 (Zelnik and Kantner 1980). The 50 percent rate dropped to 45 percent in 1982, rose back to about 50 percent in 1988, and has stayed close to that level since then (Singh and Darroch 1999, Reiss 1997, Chap. 3). There were many other changes in teenage sexual relationships, such as improved contraception, that cannot be discussed here. Finally, it should be noted that premarital coital behavior, just like premarital attitudes, changed much more for females than for males (Laumann et al. 1994). When we compare these behavioral changes with the attitudinal changes discussed above, it is apparent that the large increase in the acceptance of premarital coitus starting in the 1960s was very much in line with the contemporaneous increases in teenage coital behavior. Using the GSS surveys allows one to compare attitudes with behavior in all three areas of premarital, homosexual, and extramarital sexuality for specific years. These data show a relatively close relationship between attitudes and behaviors in all three areas. For example, among those who said premarital sex is 'always wrong,' 32 percent had premarital coitus in the last year, while among those who said premarital coitus was 'not wrong at all,' 86 percent had premarital coitus in the last year (Smith 1994, p. 89). The comparable figures for homosexuality are 1 percent vs. 15 percent, and for extramarital sex 2 percent vs. 18 percent. These are very large differences and they support the interactive relationship of attitudes and behavior in our sexual lives which others have commented upon (Reiss 1967, Chap. 7, Klassen et al. 1989, p. 253). Many of the same countries that were noted for increases in sexual attitudes since the 1960s also have national data supporting changes in sexual behavior, particularly in premarital sexuality (Inglehart 1997). A large national survey in England presents data supporting a very close association of attitudes and behaviors in premarital, homosexual, and extramarital sexuality (Johnson et al. 1994, p. 245). Also, in the area of homosexuality, Laumann reports similarities between his
1992 American data concerning homosexuality with data from other national surveys (Laumann et al. 1994). Finally, Francoeur's encyclopedic work, written by experts from 31 societies, also supports a close connection between attitude changes and behavior changes in many of the countries studied (Francoeur 1997). The puzzle concerning how, and in what temporal sequences, attitudes and behavior influence each other is one that requires careful research and theoretical attention.
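The attitude-behavior comparisons reported above amount to a cross-tabulation: within each attitude category, one computes the share of respondents reporting the behavior. The sketch below illustrates that computation on a few hypothetical records; it is not the actual survey file used by Smith (1994).

```python
# Minimal sketch of an attitude-by-behavior cross-tabulation of the kind
# reported by Smith (1994). Records are hypothetical pairs of
# (attitude answer, had premarital coitus in the last year).
from collections import defaultdict

records = [
    ("always wrong", False), ("always wrong", True), ("always wrong", False),
    ("not wrong at all", True), ("not wrong at all", True),
    ("not wrong at all", False), ("not wrong at all", True),
]

# attitude -> [n reporting the behavior, n total]
counts = defaultdict(lambda: [0, 0])
for attitude, had_coitus in records:
    counts[attitude][1] += 1
    if had_coitus:
        counts[attitude][0] += 1

for attitude, (n_behavior, n_total) in counts.items():
    print(f"{attitude}: {n_behavior / n_total:.0%} reported the behavior")
```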
5. Conclusions

The representative national surveys examined lead to several important conclusions about sexual attitudes and behaviors in the post-Kinsey era. It seems clear that there has been a sexual revolution in the area of premarital sexuality in the USA and in a large number of other Western countries. This is evidenced in both attitudes and behaviors, and more strongly on the part of females than males. There has also been a more moderate increase in the acceptance of homosexuality. Finally, it was found that the acceptance of extramarital sexuality has actually decreased in the USA and elsewhere while increasing in a few other countries.

There is a dearth of theories regarding why such sexual changes have occurred. The autonomy theory argues that the key variable in premarital sexual attitude change is the rise in autonomy (Reiss 1967, Reiss and Miller 1979). Such a change is also part of increased social acceptance of gender equality and of premarital sexuality, particularly in a gender-equal relationship. Hopkins's extensive examination of national data in the USA, testing the autonomy theory's ability to explain premarital sexuality trends from 1960 to 1990, strongly supports increases in female gender equality as a key determinant of changes in autonomy, which in turn produced changes in premarital sexual attitudes and behaviors (Hopkins 1997, Chap. 6).
6. Future Directions

Reiss has delineated the nature of a new sexual ethic that he finds is increasingly popular in many countries in the Western world. He calls this new ethic HER Sexual Pluralism, meaning that the moral yardstick in a sexual relationship is now the degree of Honesty, Equality, and Responsibility present. The older norms that judged people by whether they had performed a specific sexual behavior have increasingly been replaced by a focus on the HER relationship parameters (Reiss 1997, Chaps. 1 and 10). This new ethic fits with the increased acceptance of premarital and homosexual sexuality, for the HER ethic does not use
marriage as the Rubicon of good and bad sexuality. The drop in the acceptance of extramarital sexuality in many countries may reflect the difficulty of carrying out two HER relationships simultaneously. HER sexual pluralism is well integrated with the more gender-equal type of evolving Western society and is predicted to become the dominant sexual ethic of the twenty-first century.

One other theoretical explanation of sexual trends comes from Inglehart (1997). He postulates that as capitalist societies become more affluent and increasing numbers of their citizens feel secure, there occurs a rise in 'nonmaterialist' values. These new values stress well-being and quality of life over the accumulation of economic wealth. Inglehart argues that as part of this major emphasis on quality of life, we are witnessing a liberation and pluralization of sexual values, particularly in the premarital and homosexual areas. Inglehart's thesis is quite compatible with the autonomy theory, because both theories place the birth of the changes in sexuality at the center of the development of a new type of society: one that is more autonomous and more concerned with the quality of life than with economic survival. The fact that young people in these societies evidence these sexuality trends more than older people lends further support to the future growth of these changes. There appears to be a change in our basic social institutions in much of the world, and with that, a change in sexual relationships is occurring in line with the emerging HER sexual pluralism ethic.

Many aspects of sexual attitudes and behaviors could not be discussed in this article. Even the surveys in the three areas examined illustrate the need for a more detailed examination of the many nuances of sexuality in each area. In addition, sexual science requires better coverage of peoples outside the Western world (Barry and Schlegel 1980, Reiss 1986). We can make progress by combining qualitative and quantitative methods in our work and linking the many disciplines that study sexuality. One way that significant progress can be encouraged is by the establishment of a multidisciplinary Ph.D. degree in sexual science, and in the spring of 1999 the Kinsey Institute started work to produce just such a degree program (Reiss 1999). This program will be immensely helpful in expanding sexual science's ability to explain our sexual lives. Theory is explanation, and science without theory is just bookkeeping. In all science we need to know why something is the way we find it, not just describe what is found. With the growth of our scientific explanations we will be better able to contain the myriad sexual problems that plague sexual relationships worldwide.

See also: Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Prostitution; Rape and Sexual Coercion; Rationality and Feminist Thought;
Regulation: Sexual Behavior; Reproductive Rights in Affluent Nations; Sexual Behavior and Maternal Functions, Neurobiology of; Sexual Behavior: Sociological Perspective; Sexual Orientation: Historical and Social Construction; Sexuality and Gender
Bibliography

Barry H, Schlegel A (eds.) 1980 Cross-Cultural Samples and Codes. University of Pittsburgh Press, Pittsburgh, PA
Davis J A, Smith T W 1999 General Social Surveys, 1972–1998. University of Connecticut, Storrs, CT
Francoeur R T 1997 The International Encyclopedia of Sexuality. Continuum Publishing, New York, 3 Vols.
Hopkins K W 1997 An explanation for the trends in American teenagers' premarital coital behavior and attitudes between 1960–1990. Unpublished doctoral dissertation, University of Minnesota, Minneapolis, MN
Inglehart R 1997 Modernization and Postmodernization: Cultural, Economic, and Political Change in 43 Societies. Princeton University Press, Princeton, NJ
Johnson A M, Wadsworth J, Wellings K, Field J, Bradshaw S 1994 Sexual Attitudes and Lifestyles. Blackwell Scientific Publications, Oxford, UK
Klassen A D, Williams C J, Levitt E E 1989 Sex and Morality in the US. Wesleyan University Press, Middletown, CT
Kontula O, Haavio-Mannila E 1995 Sexual Pleasures: Enhancement of Sex Life in Finland, 1971–1992. Dartmouth Publishing, Aldershot, UK
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality. University of Chicago Press, Chicago
Reiss I L 1967 The Social Context of Premarital Sexual Permissiveness. Holt, Rinehart & Winston, New York
Reiss I L, Miller B C 1979 Heterosexual permissiveness: a theoretical analysis. In: Burr W, Hill R, Nye I, Reiss I (eds.) Contemporary Theories About the Family. Free Press of MacMillan, New York, Vol. 1
Reiss I L 1986 Journey Into Sexuality: An Exploratory Voyage. Prentice-Hall, Englewood Cliffs, NJ
Reiss I L 1997 Solving America's Sexual Crises. Prometheus Books, Amherst, NY
Reiss I L 1999 Evaluating sexual science: Problems and prospects. Annual Review of Sex Research 10: 236–71
Scott J 1998 Changing attitudes to sexual morality: A cross-national comparison. Sociology 32: 815–45
Singh S, Darroch J E 1999 Trends in sexual activity among adolescent American women: 1982–1995. Family Planning Perspectives 31: 212–19
Smith T W 1994 Attitudes toward sexual permissiveness: Trends, correlates, and behavioral connections. In: Rossi A S (ed.) Sexuality Across the Life Course. University of Chicago Press, Chicago
Zelnik M, Kantner J 1980 Sexual activity, contraceptive use and pregnancy among Metropolitan-area teenagers: 1971–1979. Family Planning Perspectives 12: 230–7
I. L. Reiss
Sexual Behavior and Maternal Functions, Neurobiology of

During the second half of the twentieth century, interdisciplinary research efforts have generated considerable information about the physiological mechanisms that control mammalian reproductive behaviors. The approach has used field and laboratory studies with animals to elucidate general principles that are beginning to be assimilated by the social sciences as they toil to understand the sexual and parental behaviors of our own species. What follows is a review of basic research on sexual behavior, sexual differentiation, and maternal functions in mammals, including humans.
1. Sexual Behavior

Mammalian sexual behavior is facilitated in males and females by the hormonal secretions of the testes and ovaries, respectively. In females the display of sexual behavior shows cycles closely associated with fluctuations in the hormonal output of the ovaries. In many species, including rodents commonly used in laboratory experiments, the cyclic display of behavior includes both changes in the ability of the female to copulate and changes in her motivation or desire to engage in sexual behavior. In these species intercourse is often physically impossible except during a brief period of optimal hormonal stimulation of central and peripheral tissues.

The central effects of ovarian hormones (i.e., estrogen (E) and progesterone (P)) in the facilitation of female sexual behavior are mediated by receptors found in several brain regions, including the ventromedial nucleus of the hypothalamus (VMH), the preoptic area, and the midbrain central gray. The effects of E and P on neurons of the VMH appear to be sufficient to facilitate the display of the postural adjustment necessary for copulation in female rats (i.e., the lordosis reflex), and the display of lordosis is prevented by lesions of the VMH (Pfaff et al. 1994). Female rats with VMH lesions can show lordosis if other neural systems that normally inhibit the display of lordosis are also surgically removed (Yamanouchi et al. 1985). The rewarding or reinforcing aspects of female copulation are likely to be mediated by dopaminergic systems that include the nucleus accumbens. The rewarding aspects of female sexuality appear to be activated only when the female can control when and how often she has contact with the male during mating, i.e., when the female can 'pace' the copulatory encounter (Erskine 1989).

In sharp contrast with nonprimate species, female monkeys are capable of engaging in sexual behavior at all phases of the ovulatory cycle and after ovariectomy. This fact, which of course also applies to women, has often been used to question the importance of ovarian hormones in the modulation of female sexuality in
primates. Other work, however, has shown that in spite of having the ability to copulate throughout the menstrual cycle, female monkeys show fluctuations in sexual motivation that are predictable from the pattern of ovarian E production across the cycle. For the behavioral effects of E to be evident, female monkeys must be tested in situations in which they are given the opportunity to choose between avoiding contact with the male or, alternatively, approaching the male and soliciting his attention. In a series of elegant experiments conducted under naturalistic conditions, Wallen and associates (Wallen 1990, 2000) have shown that in rhesus females the willingness to leave an all-female group in order to approach a sexually active male peaks at the time of maximal E production at the end of the follicular phase. Further, under the same conditions, females never approach the male if their ovarian functions are suppressed by pharmacological manipulations. As argued by Wallen, in rhesus females and perhaps also in women, ovarian hormones do not determine the ability to engage in sexual behavior but have salient effects on sexual desire. Almost nothing is known about where E acts in the brain to facilitate sexual motivation in female primates.

In male mammals, castration results in a reduction in sexual behavior, and replacement therapy with testosterone (T) or its metabolites restores the behavior to precastration levels. In nonprimate species both the motivational and performance aspects of male sexuality are affected by lack of T. In men and possibly other primates, lack of T seems to affect sexual motivation more than sexual ability (Wallen 2000). Thus, men with undetectable circulating levels of T can reach full erections when shown visual erotic materials, but report little sexual interest in the absence of T replacement. In men, erectile dysfunction is often due to nonendocrine causes such as vascular pathologies or damage to peripheral nerves.

In the brain, T seems to act primarily in the medial preoptic area (MPOA) to facilitate male sexual behavior, but it is evident that other brain regions, as well as the spinal cord and other peripheral sites, need hormonal stimulation for the optimal display of sexual behavior in males. Lesions of the MPOA have immediate and profound disruptive effects on male sexual behavior, and these effects are remarkably consistent across species. Lesions of the MPOA, however, do not equally affect all components of male sexual behavior, and some of the effects of the lesions are paradoxical. For example, in male rats lesions of the MPOA that virtually abolish male sexual behavior do not seem to affect willingness to work on an operant task when the reward is access to a receptive female. Similarly, male mice continue to show courtship behavior directed at females after receiving large lesions of the MPOA. It has been suggested that lesions of the MPOA selectively affect consummatory aspects of male behavior, leaving appetitive or motivational components intact. Not all the data fit this proposed dichotomy. For
example, rhesus monkeys that show copulatory deficits after MPOA damage do not lose all the consummatory aspects of male behavior; after the lesions the animals are capable of achieving erections and they frequently masturbate to ejaculation (Nelson 2000).

The view that male sexual behavior results primarily from T action in the MPOA and female sexual behavior from E and P action in the VMH is not without detractors. In a recent study of male sexual behavior in the rat, it was found that androgen antagonists implanted directly into the MPOA did block male sexual behavior, as would be expected. However, the most effective site for androgen antagonist blockade was in the VMH! (McGinnis et al. 1996). There is also considerable overlap between the neural systems that are active during female sexual behavior and those that are active in the male during copulation. These studies may indicate that the appetitive aspects of sexual behavior (seeking out a partner, courtship, etc.) are under the control of similar neural systems in both sexes (Wersinger et al. 1993).
2. Sexual Differentiation

In addition to the activational effects gonadal hormones have on adult sexual behavior, an extensive literature shows that gonadal hormones also affect the development of the brain systems that regulate male and female sexual behavior. These long-lasting effects are often referred to as 'organizational effects' to distinguish them from the concurrent facilitative actions that gonadal hormones have in the adult. Organizational hormone actions are thought to occur primarily during the period of sex differentiation of the nervous system. In species with short gestation, such as rats and hamsters, this developmental period occurs around the time of birth, whereas in species with longer gestation times, such as primates, sex differentiation of the nervous system takes place during fetal development (Nelson 2000).

Normal males are exposed to androgens throughout this early developmental period as a result of the testes becoming active during fetal development. When genetic female rats were treated with T throughout this time, they showed significant masculinization of behavior and genital morphology. As adults, these experimental females displayed most of the elements of male sexual behavior, including the ejaculatory reflex. When males were deprived of androgens during this time, they were less likely to show the consummatory responses associated with masculine copulation. Further, in the absence of androgens during early development, male rodents develop into adults that show most of the elements of female sexual behavior if treated with ovarian hormones. For many laboratory rodents the behavioral masculinizing effects of T treatment result from metabolites of T, namely estradiol and reduced androgens such as dihydrotestosterone.
There is good evidence that for some species it is the estrogenic metabolite that is essential for behavioral masculinization.

Ovarian hormones do not appear to play a role in the development of the neural systems that underlie feminine sexual behavior in mammals. When female laboratory rodents, such as the golden hamster, are treated with estrogen during early development, they actually show reduced levels of female sexual behavior as adults. However, their levels of male-like behavior, such as mounting receptive females, are increased, findings that are consistent with the concept that estrogen is a masculinizing metabolite of T in males.

The findings on the effects of gonadal hormones during early development of the female are not easily interpreted, because an independent definition of just what is feminine and what is masculine is not available. On the one hand, it is clear that if a female rodent is exposed to high levels of androgen throughout early development she will develop a male-like anatomy and will, as an adult, show all the elements of male sexual behavior. On the other hand, female rodents are often exposed to low levels of androgens normally during gestation as a result of developing next to a male in the uterus. These females are often more dominant than other females and less attractive to males (VomSaal 1989). But they still copulate as females and reproduce, suggesting that normal variations in female phenotype may result from normal variation in androgen exposure during development. This variability in female behavior and attractiveness may be an important part of any normal female population. The problem lies in defining the limits of normal female variation resulting from androgen (or estrogen) exposure, as opposed to what androgen effects might be interpreted as masculinization (Fang and Clemens 1999). The criteria for making this distinction have not yet been defined.

Numerous sex differences have been reported for the mammalian nervous system, and most of these probably result from the differential exposure to gonadal hormones that occurs during sex differentiation. It is also presumed that the sex differences in behavior that we have noted result from these differences in the nervous system of males and females, but few models are available to show a strong correlation between sex differences in the CNS and sex differences in behavior (but see Ulibarri and Yahr 1996). Gonadal hormones can influence the development of the nervous system in a number of ways, such as altering anatomical connectivity, neurochemical specificity, or cell survival. For example, while both male and female rats have the same number of nerve cells in the dorsomedial nucleus of the lumbosacral cord prior to sex differentiation, in the absence of androgen many of these cells die in the female, a normal process referred to as 'apoptosis.' This differential cell-death rate leaves the adult male rat with a larger dorsomedial nucleus than the female. In some brain regions it is suspected that hormones may actually promote cell death, resulting in nuclei that are larger in the female than in the male.
There are also numerous examples of sex differences in the peripheral nervous system as well.

For many years sex differences in human behavior were regarded as reflections of differences in how male and female children are reared. However, the accumulation of volumes of work showing that sex differences in nonhumans are strongly influenced by differential hormone exposure has forced scientists from many fields to re-evaluate the nurture hypothesis. Most would probably now agree to several general statements.
1. The brains of male and female humans are structurally very different.
2. These differences may reflect, at least in part, the different endocrine histories that characterize the development of men and women.
3. Some of the behavioral differences between males and females may result from these differences in their nervous systems.
Disagreement occurs when we try to specify how much of one trait or another is due to biological factors and how much to experience, and it must be recognized at the outset that a clean separation of nature from nurture is not possible.

Most of the evidence for biological factors operating to produce sex differences in human behavior comes from clinical or field psychology studies. A number of syndromes involve variation in androgen or estrogen levels during early development: congenital adrenal hyperplasia (CAH) is a syndrome in which the adrenal gland produces more androgen than normal; Turner's syndrome is characterized by regression of the ovaries and reduced levels of androgen and E from an early age; and hypogonadism is a condition in which boys are exposed to lower-than-normal levels of androgen. There are also populations of girls and boys who were exposed to the synthetic estrogen diethylstilbestrol (DES) during fetal development, as well as populations whose mothers were treated with androgenic-like hormones during pregnancy. A number of studies point to a change in girls' play behavior as a result of exposure to androgens during fetal development. CAH girls or girls exposed to exogenous androgenic compounds are often found to show an increased preference for playing with toys preferred by boys and are less likely to play with toys preferred by untreated girls. These androgen-exposed girls also are more likely to be regarded as 'tomboys' than their untreated sibs or controls and engage in more rough-and-tumble play than controls (Hines 1993). Some have argued that variation in androgen or E levels during early development may affect sexual orientation, but a general consensus on this point has not been reached at this time.
3. Maternal Functions

For rodents, especially the laboratory rat, the endocrine and neural mechanisms responsible for the onset and maintenance of maternal behavior
are relatively well understood (Numan 1994). This understanding has stemmed to a large extent from the careful description of the behaviors shown by maternal rats; these behaviors are easy to identify and to quantify in the laboratory. Such a rich and objective behavioral description is often lacking for other species, including our own.

In rats, all components of maternal care of the young (except milk production) can be induced in the absence of the hormonal changes that normally accompany pregnancy, parturition, and lactation. Thus, virgin female rats become maternal if they are repeatedly exposed to pups of the right age, a process referred to as sensitization. Hormones, nevertheless, play a critical role in the facilitation of maternal behavior and are necessary for the coordination of maternal care with the arrival of the litter. Using different experimental paradigms, several laboratories have identified E as the principal hormone in the facilitation of maternal behavior. When administered systemically or when delivered directly into the MPOA, E triggers the display of maternal behavior under many experimental conditions. Prolactin from the anterior pituitary or a prolactin-like factor from the placenta also plays a role, albeit a secondary one, in the facilitation of maternal functions; administration of prolactin or analogs of this hormone enhances the activational effects of E on maternal behavior. Also, mice lacking functional prolactin receptors show poor maternal care. Other peptides and steroids have been implicated in the support of maternal functions (Nelson 2000). For example, central infusions of oxytocin facilitate maternal behavior under some conditions, and the effects of P on food intake and fuel partitioning are crucial to meet the energetic challenges of pregnancy and lactation (Wade and Schneider 1992).

The integrity of the MPOA is necessary for the display of normal maternal behavior. Damage to the MPOA using conventional lesions or chemical lesions that spare fibers of passage interferes with normal maternal behavior under several endocrine conditions. Similar behavioral deficits are seen after knife cuts that interrupt the lateral connections of the MPOA. Cuts that interrupt other connections of the MPOA do not reproduce the behavioral deficits seen after complete lesions (Numan 1994).

Normally, male rats do not participate in the care of the young, but exposing males to pups for several days can induce components of maternal behavior. Compared to females, adult males require more days of sensitization with pups and tend to show less robust maternal behavior. This sex difference is not evident before puberty (Stern 1987), suggesting that sexual differentiation of behavior may not be completed until after sexual maturation. When males are induced to care for the pups, the behavior is sensitive to the disruptive effects of lesions of the MPOA. Since MPOA damage affects both male sexual behavior and the display of maternal care, it is possible that there is partial overlap between the neural circuits
that support these two behavioral functions. Alternatively, MPOA lesions may affect more fundamental aspects of the behavioral repertoire of the animals, and such a deficit then becomes evident in different ways under different social conditions. The effects of MPOA lesions in females are also complex and not fully understood. In addition to affecting maternal functions, MPOA damage results in a facilitation of the lordosis reflex concurrently with a reduction in the females' willingness to approach a sexually active male in testing situations that permit female pacing. Fine-grained analyses of the functional anatomy of the MPOA are needed to further elucidate the precise role of this area in mammalian reproductive functions.

Studies of monogamous mammalian species, where males and females remain together after mating, offer the opportunity to study paternal care as well as social bonding between parents and between parents and offspring. One promising model is that of the prairie vole, in which both the female and the male care for the young (DeVries and Villalba 1999). Studies of these monogamous rodents suggest that bonding of the female to the male and to her young may be enhanced by oxytocin, a peptide hormone synthesized in the hypothalamus and secreted by the posterior pituitary. In the male, paternal care appears to result not from oxytocin but from another hypothalamic hormone, vasopressin. In addition to these posterior pituitary hormones, investigators have found evidence for a role of the endogenous opiates in strengthening the bond between mother and offspring (Keverne et al. 1999).

See also: Queer Theory; Sex Hormones and their Brain Receptors; Sexual Behavior: Sociological Perspective; Sexual Orientation: Biological Influences
Bibliography

Carter C S, Lederhendler I I, Kirkpatrick B (eds.) 1999 The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA
DeVries G J, Villalba C 1999 Brain sexual dimorphism and sex differences in parental and other social behaviors. In: Carter C S, Lederhendler I I, Kirkpatrick B (eds.) The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA, pp. 155–68
Erskine M 1989 Solicitation behavior in the estrous female rat: A review. Hormones and Behavior 23: 473–502
Fang J, Clemens L G 1999 Contextual determinants of female–female mounting in laboratory rats. Animal Behaviour 57: 545–55
Haug M R, Whalen R E, Aron C, Olsen K L (eds.) 1993 The Development of Sex Differences and Similarities in Behavior. Kluwer, Boston, MA
Hines M 1993 Hormonal and neural correlates of sex-typed behavioral development in human beings. In: Haug M, Whalen R E, Aron C, Olsen K L (eds.) The Development of Sex Differences and Similarities in Behavior. Kluwer, Boston, MA, pp. 131–50
Keverne E G, Nevison C M, Martel F L 1999 Early learning and the social bond. In: Carter C S, Lederhendler I I, Kirkpatrick B (eds.) The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA, pp. 263–74
McGinnis M Y, Williams G W, Lumia A R 1996 Inhibition of male sex behavior by androgen receptor blockade in preoptic area or hypothalamus, but not amygdala or septum. Physiology & Behavior 60: 783–89
Nelson R J 2000 An Introduction to Behavioral Endocrinology, 2nd edn. Sinauer Associates, Sunderland, MA
Numan M 1994 Maternal behavior. In: Knobil E, Neil J D (eds.) The Physiology of Reproduction, 2nd edn. Raven Press, New York, Vol. 2, pp. 221–302
Pfaff D W, Schwartz-Giblin S, McCarthy M M, Kow L-M 1994 Cellular and molecular mechanisms of female reproductive behaviors. In: Knobil E, Neil J D (eds.) The Physiology of Reproduction, 2nd edn. Raven Press, New York, Vol. 2, pp. 107–220
Pfaus J G 1996 Homologies of animals and human sexual behaviors. Hormones and Behavior 30: 187–200
Stern J M 1987 Pubertal decline in maternal responsiveness in Long-Evans rats: Maturational influences. Physiology & Behavior 41: 93–99
Ulibarri C, Yahr P 1996 Effects of androgens and estrogens on sexual differentiation of sex behavior, scent marking, and the sexually dimorphic area of the gerbil hypothalamus. Hormones and Behavior 30: 107–30
VomSaal F S 1989 Sexual differentiation in litter-bearing mammals: Influence of sex of adjacent fetuses in utero. Journal of Animal Science 67: 1824–40
Wade G N, Schneider J E 1992 Metabolic fuels and reproduction in female mammals. Neuroscience and Biobehavioral Reviews 16: 235–72
Wallen K 1990 Desire and ability: Hormones and the regulation of female sexual behavior. Neuroscience and Biobehavioral Reviews 14: 233–41
Wallen K 2000 Risky business: Social context and hormonal modulation of primate sexual desire. In: Wallen K, Schneider J E (eds.) Reproduction in Context. MIT Press, Cambridge, MA, pp. 289–323
Wallen K, Schneider J E (eds.) 2000 Reproduction in Context. MIT Press, Cambridge, MA
Wersinger S R, Baum M J, Erskine M S 1993 Mating-induced FOS-like immunoreactivity in the rat forebrain: A sex comparison and a dimorphic effect of pelvic nerve transection. Journal of Neuroendocrinology 5: 557–68
Yamanouchi K, Matsumoto A, Arai Y 1985 Neural and hormonal control of lordosis behavior in the rat. Zoological Sciences 2: 617–27
A. A. Nunez and L. Clemens
Sexual Behavior: Sociological Perspective

This article considers social science research in the field of human sexual behavior since the start of the nineteenth century. Sexual behavior is understood here in a broad sense to include not just sexual acts but
also the associated verbal interactions and emotions (most notably love), as well as sexual desires, fantasies, and dysfunctions.
1. Overview

The main disciplines considered here are the sociology and anthropology of sexuality, the psychology and psychopathology of sexual behavior, and sexology. These disciplines began to take form in the nineteenth century, and were influenced by other intellectual currents and areas of knowledge which, though not discussed in this article, need to be indicated:
(a) Eugenics, in the form given it by Francis Galton from the 1860s. Eugenicist preoccupations were shared by most of the leading sexologists of the late nineteenth and early twentieth centuries (notably Havelock Ellis and Auguste Forel).
(b) The history of sexuality. This developed in the nineteenth century, initially as the history of erotic art and practices, and of prostitution; and later as the history of sexuality in the ancient world and in other cultural contexts.
(c) The ethology of sexuality. This emerged at the end of the nineteenth century, and has grown in influence since the 1960s, related partly to the development of sociobiology.
Three phases can be identified in the development of social science research on sexuality: the nineteenth century (when the emphasis was on the study of prostitution and the psychopathology of sexual behavior); the period 1900–45 (that of the great sexological syntheses, the pioneering anthropological monographs and the first sex surveys); and the period beginning in 1946 (marked in particular by an expansion of quantitative research on the general population).
2. 1830–99: From the 'Pathological' to the 'Normal'

It would be inaccurate to portray the nineteenth century in the industrialized countries as uniformly puritanical. The period was, of course, characterized by a widespread double standard in sexual morality (much less restrictive for young men than for young women), repression of masturbation, hypocrisy over the expression of love and sexual desires, censorship of literature and erotic art, and so on. But the nineteenth century also saw the development of feminism, the struggle for the civil rights of homosexuals, and the introduction of contraceptive methods in many countries. It was in the nineteenth century also that sexual behavior emerged as a major subject of scientific study.
A characteristic of much work on sexuality in this period is how a concentration on the 'pathological' and 'deviant' was used to cast new light on the 'normal'; that is, on the behavior most widespread in the population. For example, the quantitative study of prostitution preceded that of sexuality in marriage; the 'perversions' (referred to today as 'paraphilias') were examined before heterosexual intercourse between married couples; and the first scientific description of the orgasm (in 1855, by the French physician Félix Roubaud) actually appeared in a study on impotence.

The first quantitative research on sexual behavior was conducted in the 1830s, much of it using the questionnaire technique being developed at this time. The first major empirical study in this field, based on quantification and combining sociological and psychological perspectives, was the investigation of prostitution in Paris conducted by the physician Alexandre Parent-Duchâtelet (1836).

The second main current of research in the nineteenth century was concerned with the psychopathology of sexuality. The years 1886 and 1887 were a decisive period. In 1886 the first edition of Richard von Krafft-Ebing's Psychopathia sexualis was published, presenting a systematic classification of 'sexual perversions.' In 1887, the psychologist Alfred Binet (Binet 2001) published an article with the title 'Le fétichisme dans l'amour' ('Erotic fetishism'). Binet's text was the origin of an intellectual fashion for labeling the various 'sexual perversions' as 'isms' (the psychiatrist Charles Lasègue had coined the term 'exhibitionists' in 1877 but not the word 'exhibitionism'). By 'fetishism,' Binet referred to the fact of being particularly—or indeed exclusively—sexually excited by one part of the body or aspect of character, or by objects invested with a sexual significance (for example, underwear or shoes). He argued that the fetishism of any given individual was usually formed in childhood or adolescence, through a psychological process of association, during or after an experience that stirred the first strong sexual feelings. For Binet, many 'sexual perversions' as well as homosexuality should be considered as different forms of 'erotic fetishism.' Finally, he asserted that 'pathological' fetishism (such as obsessional fetishism for certain objects) was merely an 'exaggerated' form of the fetishism characteristic of 'normal' love (that of the majority of people). For a time this notion provided the unifying perspective for the psychopathology of sexuality, beginning with that elaborated by Krafft-Ebing in successive editions of his Psychopathia sexualis.

The years that followed saw the generalization of the terms 'sadism' and 'masochism' (popularized by Krafft-Ebing), 'narcissism' (invented by Havelock Ellis and Paul Näcke in 1898–99), and 'transvestism' (introduced by Magnus Hirschfeld around 1910). The psychoanalysis of Sigmund Freud integrated these various expressions and, most importantly, Binet's
ideas about the lasting influence of childhood sexual impressions. This period also saw the development—encouraged by Alfred Binet and Pierre Janet—of the analysis of the sexual content of ordinary daydreams. The first questionnaire-based surveys of what in the twentieth century came to be referred to as sexual fantasies were carried out in the United States in the 1890s, in the context of research on adolescence led by G. S. Hall.
3. 1900–45: Large-scale Sexological Syntheses, Pioneering Anthropological Monographs and the First Sex Surveys

The large volume of research conducted between 1900 and the end of World War II can be divided into three main currents (for a general view of the most significant contributions from this period, see Westermarck 1936).

The first current is that of sexological research. It was in this period that sexology acquired an institutional status. The first sexological societies were set up in Germany in the years after 1910, and in 1914 Albert Eulenburg and Iwan Bloch founded the period's most important journal of sexology (the Zeitschrift für Sexualwissenschaft). The first Institute for Sexual Science was opened by Magnus Hirschfeld in Berlin in 1919. In the 1920s the first international conferences of sex research were held. This period also saw publication of large-scale works of synthesis, in particular those by Auguste Forel (Swiss), Albert Moll, Hermann Rohleder, and Magnus Hirschfeld (German), followed later by René Guyon (French) and Gregorio Marañón (Spanish). In the 1920s and 1930s, sexology became more self-consciously 'political.' A stated aim was to advance the 'sexual liberation' of young people and women, a cause advocated in influential books by B. Lindsey and W. Evans, Bertrand Russell, and Wilhelm Reich. Also published in these years were a number of extremely successful works popularizing sexological questions (the best known being those of the Englishwoman Marie Stopes and of the Dutch gynecologist Th. H. Van de Velde). The aim of these manuals was to promote an enjoyment of marital sex, and the emphasis was accordingly on sexual harmony, orgasm, and sexual dysfunctions, and no longer—as had been the case at the end of the nineteenth century—on 'sexual perversions.' The most representative and influential expression of the various tendencies in sexology at this time were the seven volumes of Studies in the Psychology of Sex (1900–28) by the Englishman Havelock Ellis.

The second current of research is in sexual anthropology. Broadly speaking, these were either comparative studies of marriage and sexual life (E. Crawley, W. I. Thomas, W. G. Sumner, A. Van Gennep, R.
Briffault, K. Wikman and, foremost, E. Westermarck) or anthropological monographs, in particular those by B. Malinowski, M. Mead and G. Gorer. The most important of these monographs is that of Bronislaw Malinowski (1929) on the natives of the Trobriand Islands in New Guinea: topics examined include prenuptial sexuality, marriage and divorce, procreation, orgiastic festivals, erotic attraction, sexual practices, orgasm, the magic of love and beauty, erotic dreams and fantasies, as well as the morals of sex (decency and decorum, sexual aberrations, sexual taboos).

The third and final current from this period is quantitative studies of sex behavior. Before 1914 this research was conducted mainly in Russia, Germany and Scandinavia. Between the wars, it developed primarily in the United States (R. Pearl, G. V. Hamilton, K. B. Davis, R. L. Dickinson, and L. Beam). These works prepared the way for, and in many respects prefigured, the research conducted by Kinsey and his co-workers from 1938.
4. Post-1946: Empirical Research in the Age of Sexual Liberalization

In many countries, the second half of the twentieth century was a period of sexual liberalization. The improved status of women was reflected in a greater recognition of their rights in sexual matters (with implications for partner choice, use of contraception and abortion, as well as sexual pleasure). One consequence of this change was to encourage research into contraception: the contraceptive pill became available from 1960, and sterilization for contraceptive purposes was the most widely used means of birth control in the world by the end of the 1970s. Research was also encouraged into the physiology of the orgasm, and particularly the female orgasm, in the 1950s and 1960s (E. Gräfenberg, A. M. Kegel, A. C. Kinsey, W. H. Masters and V. E. Johnson, etc.). Another important factor of change was the arrival at adolescence in the 1960s of the postwar baby boom generation; economic affluence was the context for their demands for greater sexual freedom. This aspiration was reflected in a fall in the age of first intercourse, especially for young women, and, related to this, a decline in the norm of female virginity at first marriage (or formation of first stable union). This liberalization reached its peak in the developed countries at the end of the 1970s and was brought to an abrupt halt by the AIDS epidemic, awareness of which began to develop, first in the United States, from 1981.

In the course of the last fifty years of the twentieth century, the social sciences have made a major contribution to the understanding of human sexuality. Special mention must be made of the contributions from historical demography, history (R. Van Gulik, K. J. Dover, M. Foucault, P. Brown, and others),
ethnology (V. Elwin, G. P. Murdock, C. S. Ford and F. A. Beach), the psychology of sexuality (see Eysenck and Wilson 1979), but also research originating in gay and lesbian studies conducted from a perspective of 'social constructionism' (sexuality is not a biological given but is socially constructed), as well as research on sexual identity, transsexualism, pornography and fantasies (for a general overview of the research mentioned above see Ariès and Béjin (eds.) 1982, Allgeier and Allgeier 1988, McLaren 1999). Many of the new insights into sexual behavior acquired in this period have come from quantitative empirical research.

The most influential of this research in the 1940s and 1950s was that directed in the United States by Alfred Kinsey. Between 1938 and 1954, Kinsey and his co-researchers interviewed more than 16,000 volunteers. While the personal information they collected was probably reliable, the sample constructed by Kinsey's team was not representative of the US adolescent and adult population. Kinsey et al. (1948, 1953) distinguished the following 'sources of sexual outlet': masturbation, nocturnal emissions (or sex dreams), premarital heterosexual petting, premarital coitus (or intercourse), marital coitus, extramarital coitus, intercourse with prostitutes, homosexual responses and contacts, and animal contacts. They established that the sexual history of each individual represents a unique combination of these sources of outlet and showed that between individuals there could be wide variation in 'total sexual outlet' (the sum of the orgasms derived from the various sources of sexual outlet). They also identified a number of sociological patterns. For example, compared with less educated people, better educated men and women had first heterosexual intercourse later, but had greater acceptance and experience of masturbation, heterosexual petting, foreplay and orogenital sexual practices. Also, according to these researchers, people with premarital petting experience were more likely to have a stable marriage.

In these two volumes there were curious omissions, the most striking being the almost total neglect of the emotions, notably love. And the interpretations given by the authors were sometimes debatable, such as the presentation of premature ejaculation as almost 'normal' because it happened to be widespread in the United States, or the judgment that women's erotic imagination was much less developed than men's on the grounds that it seemed less responsive to sexually explicit images. It was in large part because of these two volumes that sexology between the 1950s and the end of the 1970s often resembled little more than what has been described as 'orgasmology' (see Ariès and Béjin (eds.) 1982, pp. 183 et seq.). However, they do represent an important stage in the development of the sociology of sexuality.

A large number of quantitative surveys on sexuality were conducted in the 1960s and 1970s, influenced in
part by Kinsey’s research, but which gave much greater attention to the sexual attitudes, personality, family background, feelings and even fantasies of the people being interviewed. It has also to be noted that this research was increasingly based on representative samples. These surveys were conducted on adult populations (Sweden, England, France, Finland, United States) but also on adolescents (England, United States, Denmark), young students and workers (West Germany) and homosexuals (United States, West Germany, in particular). They were conducted in a climate of increased politicization of sexuality that recalled the 1920s and often had utopian aspirations. The adoption of alternative sexual lifestyles (exemplified by the ‘communes’) was advocated by some; others celebrated the revolutionary potential of the (chiefly clitoral) orgasm. But this climate changed rapidly in the 1980s with the emergence of AIDS. This epidemic, for which no vaccine existed, demonstrated the need for up-to-date empirical data on sexual behavior as a basis for encouraging prevention behaviors by the population (more careful partner selection, and the use of HIV testing and condoms, etc.). In response, large-scale surveys of the general population, often using probability samples, were carried out in the 1990s, in Europe and the United States (see Wellings et al. 1994, Laumann et al. 1994, Kontula and Haavio-Mannila 1995, Be! jin 2001) and in the developing world (Cleland and Ferry (eds.) 1995). The sex surveys of the 1990s cannot be summarized here, but a number of their shared characteristics can be identified. They are based either on interviews or questionnaires (either face-to-face, or self-administered, or by telephone), and have been facilitated by the unquestionably greater willingness in recent decades to talk about sex, as reflected in better participation and response rates and fewer abandons. The theories advanced to interpret the data collected are often, though not always, those which Laumann et al. (1994, pp. 5–24) refer to as ‘scripting theory,’ ‘choice theory’ and ‘social network theory.’ The first postulates that, because of their exposure to an acculturation process, individuals usually follow ‘sexual scripts’ which prescribe with whom, when, where, how, and why they should have sex. The second places the emphasis on the costs (in time, money, emotional and physical energy, personal reputation, etc.) of sexual behavior. The third seeks to understand why some types of sexual relations occur between people with similar social characteristics whereas others (more unconventional) involve socially more contrasted individuals. One noteworthy finding from these surveys is that, compared with the developed countries, those of SubSaharan Africa are characterized by earlier occurrence of first heterosexual intercourse and a higher level of multiple partnership, but also by a substantially higher
percentage of people who had not had intercourse during the previous month. In other words, in countries with a 'young' population structure, heterosexual activity tends to begin sooner but occurs less frequently and extends over a shorter period.
5. Conclusion: Future Directions

To simplify, it can be said that in the nineteenth century the social sciences (including sexology) focused primarily on the forms of sexuality considered to be 'deviant' or 'perverse.' In addition, they gave priority to an exploration of behavior before beginning to study the psychological aspects (personality, sexual desires and fantasies). The same process occurred in the twentieth century, but on a broader scale, taking as its subject the general population and thus an 'average' sexuality. Initially, between the 1920s and the end of the 1960s, the emphasis was on sexual practices (in particular, 'sexual technique' and the orgasm). In a second period, however, especially since the start of the 1970s, attention has focused increasingly on sexual desire and fantasies. The fact, for example, that Viagra (the erection-enhancing drug first marketed in 1998) only works for men who feel attracted to their partners illustrates the need for an understanding of the interior aspects of sexuality, in particular of psychological blockages, desires and fantasies. This suggests that one trend in the future will be a growth of research on sexual fantasies and the complexities of sexual orientation and identity, and into the effects of pornography, cybersex and sexual addictions.

A second probable trend in future research is the development of the comparative study of 'national' sexualities as revealed in the sex surveys of the 1990s. The data on sexuality that have been assembled need to be subjected to analysis and interpretation. A comparative analysis of national sex surveys offers an excellent means of identifying the influence of culture on sexual attitudes and behavior as well as on desires and fantasies. It should be possible, for example, to compare the respective influence of cultures that are predominantly hedonistic or ascetic in orientation. Other factors to be assessed include religious beliefs and practices, population aging, democratization, and the relationship between the sexes. This material is potentially the basis for new syntheses in the sociology of sexuality, comparable in scope to the great sociosexological syntheses produced in the first thirty years or so of the twentieth century.

A third trend in future research will be a continuing exploration of the themes developed since the 1970s whose common point is their focus on sexual behavior that is more or less coercive in character: paedophilia, sexual tourism, sexual violence, sexual harassment, and sexual mutilations, notably those inflicted on women.
A fourth line of research for the future follows from the growing numbers of elderly and old people in the population, who increasingly expect to remain sexually active. Their sexuality will surely be the subject of further and more detailed research.

The fifth and final trend for the future concerns a partial renewal of social science research on sexuality arising from recent developments in the ethology of sexuality. More generally, indeed, interdisciplinary perspectives can be expected to inform and enrich all areas of social science research in the field of sexual behavior.

See also: Eugenics, History of; Family and Kinship, History of; Heterosexism and Homophobia; Sex Therapy, Clinical Psychology of; Sexual Attitudes and Behavior; Sociobiology: Overview
Bibliography

Allgeier A R, Allgeier E R 1988 Sexual Interactions. Heath, Lexington, MA
Ariès P, Béjin A (eds.) 1982 Sexualités occidentales. Seuil, Paris [1985 Western Sexuality. Basil Blackwell, Oxford, UK]
Béjin A 2001 Les fantasmes et la vie sexuelle des Français. Payot, Paris
Binet A 2001 Le fétichisme dans l'amour [1st edn. 1887]. Payot, Paris
Cleland J, Ferry B (eds.) 1995 Sexual Behaviour and AIDS in the Developing World. Taylor and Francis, London
Ellis H 1900–28 Studies in the Psychology of Sex (7 Vols.). Davis, Philadelphia, PA
Eysenck H J, Wilson G 1979 The Psychology of Sex. Dent, London
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. Saunders, Philadelphia, PA
Kinsey A C, Pomeroy W B, Martin C E, Gebhard P H 1953 Sexual Behavior in the Human Female. Saunders, Philadelphia, PA
Kontula O, Haavio-Mannila E 1995 Sexual Pleasures: Enhancement of Sex Life in Finland, 1971–1992. Dartmouth, Aldershot, UK
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality: Sexual Practices in the United States. University of Chicago Press, Chicago, IL
McLaren A 1999 Twentieth-century Sexuality: A History. Basil Blackwell, Oxford, UK
Malinowski B 1929 The Sexual Life of Savages in North-Western Melanesia. Routledge, London
Parent-Duchâtelet A 1836 De la prostitution dans la ville de Paris. Baillière, Paris
Wellings K, Field J, Johnson A M, Wadsworth J 1994 Sexual Behaviour in Britain. Penguin Books, Harmondsworth, UK
Westermarck E A 1936 The Future of Marriage in Western Civilization. Macmillan, London
A. Béjin
Sexual Harassment: Legal Perspectives

After a remarkably swift development in law and popular consciousness, sexual harassment remains the subject of controversy and debate. The concept of sexual harassment is largely an American invention. Like many other concepts that have mobilized social action in the United States, this one emerged in the context of law reform. In the 1970s, US feminists succeeded in establishing sexual harassment as a form of sex discrimination prohibited by Title VII of the Civil Rights Act of 1964 (the major federal statute prohibiting discrimination in employment). Since then, the concept has taken hold in broader arenas such as organizational practice, social science research, media coverage, and everyday thought. Although Title VII remains the primary legal weapon against sexual harassment in the US, traditional anti-discrimination law squares uneasily with the overtly sexual definition of harassment that emanated from 1970s feminist activism and ideas. Just at the time when this narrow sexual definition of harassment has come under criticism in the US, however, that same definition has begun to spread to other nations—inviting inquiry around the globe. Today, sexual harassment's conceptual boundaries are being rethought in the law, in the wider culture, and in feminist thought.
1. Definitions and Origins

In the United States, harassment is predominantly defined in terms of unwanted sexual advances. In the eyes of the public and the law, the quintessential case of harassment involves a powerful male supervisor who makes sexual advances toward a female subordinate. Harassment is an abuse of sexuality; it connotes men using their workplace power to satisfy their sexual needs.

This sexual model of harassment was forged in early Title VII law. Some of the earliest cases were brought by women who had been fired for refusing their bosses' sexual advances. Lower courts at first rejected these claims, reasoning that the women had been fired because of their refusal to have affairs with their supervisors and not 'because of [their] sex' within the meaning of the law. The appellate courts reversed; they held employers responsible for the bosses' conduct as a form of sex discrimination now called quid pro quo harassment. The results were a step forward: It was crucial for the courts to acknowledge that sexual advances can be used as a tool of sex discrimination. But the reasoning spelled trouble, because the courts' logic equated the two. The courts said the harassment was based on sex under Title VII because the advances were driven by a sexual attraction that the male supervisor felt for a woman but would not have felt for a man. By locating
the sex bias in the sexual attraction presumed to underlie the supervisor's advances, these decisions singled out (hetero)sexual desire as the sine qua non of harassment. Had the supervisor demoted the plaintiff or denigrated her intelligence, the court would have had far more difficulty concluding that the conduct was a form of sexism proscribed by law.

Even at the time, there were broader frameworks for understanding women's experiences as sex harassment. Carroll Brodsky's book, The Harassed Worker (1976), for example, articulated a comprehensive nonsexual definition of harassment as 'treatment that persistently provokes, pressures, frightens, intimidates, or otherwise discomforts another person.' Rather than a form of sexual exploitation, Brodsky saw harassment as 'a mechanism for achieving exclusion and protection of privilege in situations where there are no formal mechanisms available' (Brodsky 1976, p. 4). In his usage, 'sexual harassment' referred not simply to sexual advances, but to all uses of sexuality as a way of tormenting those who felt 'discomfort about discussing sex or relating sexually' (Brodsky 1976, p. 28). A few Title VII decisions had already recognized sexual taunting and ridicule as a mechanism for male supervisors and co-workers to drive women away from higher-paying jobs and fields reserved for men. Indeed, the very concept of harassment as a form of discrimination in the terms and conditions of employment was first invented in race discrimination cases, where judges had discovered that employers could achieve racial segregation not only through formal employment decisions such as hiring and firing, but also through informal, everyday interactions that create an atmosphere of racial inferiority in which it is more difficult for people of color to work. Analogizing to the race cases, the courts might have likened bosses' demands for sexual favors from women to other discriminatory supervisory demands—such as requiring black women to perform heavy cleaning that is not part of their job description in order to keep them in their place, or requiring women to wear gendered forms of dress or to perform stereotypically feminine duties not considered part of the job when men do it. Or judges might have located the sexism in a male boss's exercise of the paternalist prerogative to punish as an employee someone who dares to step out of her place as a woman by refusing the boss sexual favors; sociological analysis reveals that male bosses have penalized female employees for other nonsexual infractions that represent gender insubordination rather than job incompetence (Crull 1987).

But the courts relied instead on a sexualized framework put forward by some feminist lawyers and activists. Toward the mid-1970s, US cultural-radical feminists moved toward a simplistic view of heterosexuality as the lynchpin of women's oppression (Willis 1992, p. 144). Given this ideological commitment, it is not surprising that these early feminists conceived of women's workplace harassment as a
form of unwanted sexual advances analogous to rape. Lin Farley's book, Sexual Shakedown, defined harassment as 'staring at, commenting upon, or touching a woman's body; requests for acquiescence in sexual behavior; repeated nonreciprocated propositions for dates; demands for sexual intercourse; and rape' (Farley 1978, p. 15). A few years later, Catharine MacKinnon argued that harassment is discriminatory precisely because it is sexual in nature—and because heterosexual relations are the primary mechanism through which male dominance and female subordination are maintained (MacKinnon 1979). In the US, less than a decade later, the consolidation of this narrow view of women's workplace harassment was largely complete. The 1980 Equal Employment Opportunity Commission (EEOC) guidelines defined sex harassment as 'unwelcome sexual advances, requests for sexual favors, and other verbal or physical conduct of a sexual nature'—a definition courts have read to require overtly sexual conduct for purposes of proving both quid pro quo harassment (which involves conditioning employment opportunities on submission to sexual advances) and hostile work environment harassment (which involves creating an intimidating or hostile work environment based on sex). Indeed, in hostile environment cases, the lower courts have tended to exonerate even serious sexist misconduct if it does not resemble a sexual advance (Schultz 1998). Media coverage has infused this view of harassment into popular culture. The 1991 Anita Hill-Clarence Thomas controversy helped solidify the view of harassment as sexual predation. Hill, at the time a novice lawyer in her mid-twenties, claimed that Thomas, her then-supervisor at the Department of Education and later Chair of the EEOC, had pressured her to go out with him and regaled her with lewd accounts of pornographic films and his own sexual prowess (Mayer and Abramson 1994). Soon afterward, the news media broke the story of Tailhook, in which drunken Navy pilots sexually assaulted scores of women at a raucous convention (Lancaster 1991). Later in the 1990s, public attention turned to the harassment lawsuit of a former Arkansas state employee, Paula Jones, who alleged that President Bill Clinton had made crude sexual advances toward her while he was the Governor of Arkansas. Organizations seeking to avoid legal liability for sexual harassment have also defined it in terms of overtly sexual conduct. Many employers have adopted sexual harassment policies, but Schultz's research reveals that these policies are rarely if ever integrated into broader anti-discrimination programs. Indeed, some employers' efforts to avert sexual harassment may undermine their own efforts to integrate women fully into the workplace, as firms adopt segregationist strategies designed to limit sexual contact between men and women (such as prohibiting men and women from traveling together). Such policies reinforce perceptions of women as sexual objects and deprive them
of the equal training and opportunity the law was meant to guarantee.
2. Current Challenges

In recent years, the prevailing understanding of sexual harassment has come under challenge in the US. Civil libertarians have voiced concern that imposing vicarious liability on employers for their employees' sexual harassment gives employers a powerful incentive to curb workers' freedom of speech and sexual expression in the workplace. Critics say harassment law incorporates vague standards—including the requirement that hostile work environment harassment be 'sufficiently severe or pervasive to alter the conditions of the victim's employment and create an abusive working environment' (Meritor Savings Bank v. Vinson, p. 67)—that permit employers to adopt broad policies that chill sexual speech (Strossen 1995, Volokh 1992). Many employers may not limit their policies to 'unwelcome' conduct, because they will want to avoid costly, contentious inquiries into whether particular sexual interactions were unwelcome. Because few employers have a stake in protecting their employees' freedom of expression (and only government employers have any First Amendment obligation to do so), many firms may simply adopt broad, across-the-board proscriptions on sexual activity and talk on the part of employees (Rosen 1998, Hager 1998). Although there has been no systematic research in this area, some alarming incidents have been reported (Schuldt 1997, Grimsley 1996). Such concerns have resonated with a new generation of feminist legal scholars, who have begun to worry about the extent to which equating workplace sexual interaction with sex discrimination replicates neo-Victorian stereotypes of women's sexual sensibilities (Abrams 1998, Franke 1997, Strossen 1995). While civil libertarians have urged repealing or restricting the range of employer liability under Title VII, often proposing instead to hold individual harassers responsible for their own sexual misconduct under common law (Hager 1998, Rosen 1998), younger feminist scholars have focused on reforming sex harassment law to bring it in line with traditional antidiscrimination goals. Kathryn Abrams defines harassment as sex discrimination not because (hetero)sexual relations inherently subordinate women, but because she believes male workers use sexual advances to preserve masculine control and norms in the workplace; she would limit liability to cases in which the harasser manifestly disregards the victim's objections or ambivalence toward his advances (Abrams 1998). Katherine Franke argues that harassment is a 'technology of sexism' through which men police the boundaries of gender; she would focus on whether men have used sexuality to press women or other men
into conventional 'feminine' or 'masculine' roles (Franke 1997). Both Abrams and Franke attempt to break with the old equation of sexuality and sexism. Yet neither makes a decisive break, for each retains the idea that (hetero)sexual objectification is the key producer of gender (for Abrams, of gender subordination in the workplace; for Franke, of gender performance throughout social life). Bolder analyses seek to jettison altogether sexual harassment's conceptual underpinnings in sexuality. Janet Halley's queer theory-based critique emphasizes both the cultural/psychic dangers of outlawing the expression of sexuality and the heavier-handed repression such an approach places on sexual minorities. To the extent that harassment law focuses on whether sexual conduct is offensive to a reasonable person, judges and juries will rely on their 'common sense' to evaluate the advances—and other actions—of gays, lesbians, bisexuals, and other sexual dissidents as inherently more offensive than those of heterosexuals (Halley 2000). As Kenji Yoshino has observed, the courts have conditioned liability in same-sex sexual harassment cases on the harasser's sexual orientation. Conduct that courts consider an unwanted sexual advance when the harasser is homosexual is deemed innocuous horseplay when the harasser is heterosexual—an approach that sets up a two-tiered system of justice that has nothing to do with the victim's injury (Yoshino 2000). (See Sexual Orientation and the Law.) For related reasons, Schultz has argued that sex harassment law should abandon its emphasis on sexual misconduct and focus on gender-based exclusion from work's privileges. She argues for reconceptualizing harassment as a means of preserving the masculine composition and character of highly-valued types of work and work competence (Schultz 1998). In Schultz's view, the prevailing sexual understanding of harassment is too narrow, because it neglects more common, non-sexual forms of gender-based mistreatment and discrimination that keep women in their place and prevent them from occupying the same heights of pay, prestige, and authority as men. Indeed, Schultz contends, the centrality of occupational identity to mainstream manhood leads some men to harass other men they regard as unsuitably masculine for the job. At the same time, the sexual model risks repressing workers' sexual talk or interaction—even where these do not threaten gender equality on the job. Schultz's call to move away from a sexuality-based harassment jurisprudence builds on the earlier work of Regina Austin, who recognized in 1988 that most workplace harassment was not 'sexual' in nature and proposed a tort of worker abuse to protect employees from class-based mistreatment at the hands of their bosses (Austin 1988). Although Austin was concerned with structures of class and race, recently a more individual dignitary approach has been revived by Anita Bernstein and by Rosa Ehrenreich, who propose protecting
employees' rights to equal respect through antidiscrimination and tort law, respectively (Bernstein 1997, Ehrenreich 1999). Recognizing harassment as just another form of discriminatory treatment restores Title VII's protections to those who allege discriminatory abuse based on characteristics other than gender or even race, such as religion and disability (Goldsmith 1999).
3. Future Directions

In the United States, the sexual model of harassment has always rested uneasily alongside traditional employment discrimination law—which is concerned with work, not sexuality. Although the future is far from certain, the US Supreme Court's recent sexual harassment decisions seem poised to restore harassment law to its traditional focus. The Court's 1998 decisions in Burlington Industries v. Ellerth and Faragher v. City of Boca Raton held that a company's vicarious liability for a supervisor's harassment turns on whether the harassment involves a 'tangible employment action'—such as hiring, firing, or promotion—not on the content of the misconduct or its characterization as quid pro quo or hostile environment harassment. In the absence of such a tangible action, companies can avoid liability by proving that they adequately corrected harassment a victim reported (or reasonably should have reported) through acceptable in-house channels. By creating a loophole for companies that investigate harassment through their own procedures, the Court sought to counter any current incentives for managers to ban sexual interactions across the board. By adhering to vicarious liability where harassment involves the same tangible employment decisions as more traditional forms of discriminatory treatment, the Court acknowledged that harassment is simply a form of employment discrimination subject to the usual legal rules (White 1999). The Court's decision in Oncale v. Sundowner Offshore Services further reconciles sexual harassment with the traditional discrimination approach. In Oncale, the Court held that male-on-male harassment is actionable under Title VII, taking pains to emphasize that harassment is not to be equated with conduct that is sexual in content or design. 'We have never held that workplace harassment … is … discrimination because of sex merely because the words used have sexual content or connotations,' said the Court (Oncale v. Sundowner Offshore Services, Inc., p. 80). By the same token, 'harassing conduct need not be motivated by sexual desire to support an inference of discrimination on the basis of sex' (Oncale v. Sundowner Offshore Services, Inc., p. 80). 'The critical issue,' stressed the Court, 'is whether members of one sex are exposed to disadvantageous terms or conditions of employment to which members of the other
sex are not exposed' (Oncale v. Sundowner Offshore Services, Inc., p. 80). The social sciences are also beginning to look beyond the sexual model, providing evidence of the need to conceptualize sex harassment in broader terms. In the early years, most researchers attempted to document the prevalence of harassment, albeit with limited results due to their implicit reliance on a conceptually narrow (yet often vague) notion of harassment specified by lists of sexual behaviors derived from the EEOC guidelines (Welsh 1999). Over time, the research has become more multi-faceted, as a burgeoning scholarship has focused greater attention on defining harassment, theorizing its causes, documenting its consequences, and developing predictors. Yet much research remains wedded to the sexual view. Prominent theories of harassment posit that men's tendency to stereotype women as sexual objects is 'primed' by the presence of sexual materials or behavior to induce a 'sex-role spillover' effect that leads men to sexualize women inappropriately in the workplace (Fiske 1993, Gutek 1992). Predictive models search for characteristics that predispose men to harassment, such as a propensity to sexualize women over whom they have some supervisory authority (Pryor et al. 1995). Further examples abound (see Welsh 1999, Borgida and Fiske 1995). Nonetheless, recent research has opened up a broader horizon. A few studies have included measures of gender-based harassment that is not necessarily sexual in content or design: The results suggest that such harassment is more widespread than overtly sexual forms (Frank et al. 1998, Fitzgerald et al. 1988). Social psychologists have begun to look beyond sexual objectification to explore how other gender-based stereotypes and motives can combine with occupational identities and institutional contexts to produce a variety of forms of workplace harassment and discrimination that are not all motivated by sexual attraction (Fiske and Glick 1995). Even some economists have moved away from the conventional view of harassment as sexual coercion that is unrelated to the internal dynamics of labor markets (Posner 1989) to develop innovative theories that explain how incumbent workers can obtain the power to exclude and disadvantage aspiring entrants (Lindbeck and Snower 1988). These developments align social psychological and economic theories more closely with those of sociologists, who have long emphasized harassment's connection to structural features of organizations such as gross numerical imbalance, standardless selection processes, and the absence of managerial accountability (Reskin 2000, Schultz 1991). In tension with such efforts to transcend a narrow definition of sex harassment in the United States are developments in some other parts of the globe, where the sexual model championed by early US cultural-radical feminists seems to be gaining headway. According to some commentators, the American
understanding of sex harassment has been disseminated abroad so successfully that it now forms the foundation for international debates on sex harassment (Cahill 2000). Following closely on the heels of legal developments in the USA, for example, the European Union took steps to condemn and outlaw workplace sex harassment as a violation of women's dignity, and it defined this concept in terms strikingly similar to the definition promulgated by the US EEOC. Encouraged by EU initiatives, feminists in Europe have drawn on particular features of the US model to promote versions of sex harassment law that resonate with their own traditions. In Austria, Cahill shows, feminists capitalized on desires to signal Austria's compliance with the economic First World's laws to press for a harassment law that adopts both the sexualized substantive definition and the privatized enforcement mechanisms of the US approach. The importation of these features allows critics to reject not only the law but also the very existence of harassment as a nonindigenous, imperialist export (Cahill 2000). In France, as Saguy shows, French feminists won a law that criminalizes the use of 'orders, threats, constraint or serious pressure in the goal of obtaining sexual favors, by someone abusing the authority conferred by his position' (Saguy 2000, p. 19 and note 34). This approach incorporates the American cultural-radical feminist view of harassment as a form of sexual abuse yet simultaneously signals distance from what the French perceive to be US sexual prudishness by highlighting that harassment is an abuse of hierarchical authority—an idea that conforms to conventional French views of hierarchical power as inimical to equality. Neither French feminists nor lawmakers connect harassment to a larger system of workplace gender inequality that relegates women to inferior jobs; the gender segregation of work is accepted as a 'neutral' background condition rather than challenged as the structural context of inequality in which sex harassment flourishes and which it fosters. Friedman's work suggests that even in Germany, where there is a tradition of using law to promote workers' empowerment, transplanting the US model of harassment-as-sexual-overtures serves to exacerbate what German feminists have decried as a conservative judicial tendency to define sex harassment as a violation of female sexual honor which requires moral purity (Friedman 2000). It is perhaps ironic that just as German (and other) feminists are trying to highlight the specificity of sexual harassment, a new generation of US feminists is striving to incorporate sexual harassment into broader frameworks for understanding the dynamics of ingroup/outgroup exclusion among different groups of workers—a project that might well be assisted by expanding the German concept of 'mobbing' as a pervasive pattern of workplace harassment that targets an individual for exclusion and abuse at the hands of coworkers and supervisors (Friedman 2000,
p. 6). What is needed is a structural analysis of how a variety of different forms of harassment—including sexual advances, gender-based and other forms of taunting, physical threats, verbal denigration, work sabotage, heightened surveillance, and social isolation—can be used by socially dominant groups to code and to claim scarce social resources (such as good jobs and a sense of entitlement to them) to the exclusion of others. In this project, Americans have as much to learn from other countries' experiences as vice versa. It is cause for optimism that so many scholars, activists, and policymakers are struggling toward such a broader—and deeper—understanding of harassment. Inspired by new research, new feminisms, and newer social movements (such as queer theory) that are calling into question the old top-down, male-female sexual model, the law of sex harassment awaits an overhaul to fit the world of the twenty-first century.

See also: Autonomy at Work; Gender and the Law; Gender Differences in Personality and Social Behavior; Heterosexism and Homophobia; Lesbians: Historical Perspectives; Lesbians: Social and Economic Situation; Male Dominance; Organizations: Authority and Power; Psychological Climate in the Work Setting; Rape and Sexual Coercion; Regulation: Sexual Behavior; Sex Differences in Pay; Sex Segregation at Work; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Harassment: Social and Psychological Issues; Sexuality and Gender; Workplace Environmental Psychology
Bibliography

Abrams K 1998 The new jurisprudence of sexual harassment. Cornell Law Review 83: 1169–1230
Arvey R D, Cavanaugh M A 1995 Using surveys to assess the prevalence of sexual harassment: Some methodological problems. Journal of Social Issues 51(1): 39–50
Austin R 1988 Employer abuse, worker resistance, and the tort of intentional infliction of emotional distress. Stanford Law Review 41: 1–59
Bernstein A 1997 Treating sexual harassment with respect. Harvard Law Review 111: 445–527
Borgida E, Fiske S T (eds.) 1995 Special issue: gender stereotyping, sexual harassment, and the law. Journal of Social Issues 51(1): 1–207
Brodsky C 1976 The Harassed Worker. Lexington Books, Lexington, MA
Burlington Industries v. Ellerth. 1998. 524 US 742
Cahill M 2000 The legal problem of sexual harassment and its international diffusion: A case study of Austrian sexual harassment law. (unpublished manuscript)
Crull P 1987 Searching for the causes of sexual harassment: An examination of two prototypes. In: Bose C, Feldberg R, Sokoloff N (eds.) Hidden Aspects of Women's Work. Praeger Press, New York, pp. 225–44
Ehrenreich R 1999 Dignity and discrimination: Toward a pluralistic understanding of workplace harassment. Georgetown Law Journal 88: 1–64
Faragher v. City of Boca Raton. 1998. 524 US 775
Farley L 1978 Sexual Shakedown: The Sexual Harassment of Women on the Job. McGraw-Hill, New York
Fiske S T 1993 Controlling other people: The impact of power on stereotyping. American Psychologist 48: 621–8
Fiske S T, Glick P 1995 Ambivalence and stereotypes cause sexual harassment: A theory with implications for organizational change. Journal of Social Issues 51: 97–115
Fitzgerald L F, et al. 1988 The incidence and dimensions of sexual harassment in academia and the workplace. Journal of Vocational Behavior 32: 152–75
Frank E et al. 1998 Prevalence and correlates of harassment among US women physicians. Archives of Internal Medicine 158: 352–8
Franke K 1997 What's wrong with sexual harassment? Stanford Law Review 49: 691–772
Friedman G 2000 Dignity at work: workplace harassment in Germany and the United States. (unpublished manuscript)
Goldsmith E 1999 God's House or the Law's. Yale Law Journal 108: 1433–40
Grimsley K D 1996 In combating sexual harassment, companies sometimes overreact. Washington Post (Dec. 23): A1
Gutek B 1992 Understanding sexual harassment at work. Notre Dame Journal of Law, Ethics & Public Policy 6: 335–58
Hager M 1998 Harassment as a tort: Why Title VII hostile environment liability should be curtailed. Connecticut Law Review 30: 375–439
Halley J 2000 Sexuality harassment. (unpublished manuscript)
Kanter R M 1977 Men and Women of the Corporation. Basic Books, New York
Lancaster J 1991 Navy 'gauntlet' probed: Sex harassment alleged at fliers' convention. Washington Post (Oct. 30): A1
Lindbeck A, Snower D 1988 The Insider-Outsider Theory of Employment and Unemployment. MIT Press, Cambridge, MA
MacKinnon C A 1979 Sexual Harassment of Working Women: A Case of Sex Discrimination. Yale University Press, New Haven, CT
Mayer J, Abramson J 1994 Strange Justice: The Selling of Clarence Thomas. Houghton Mifflin, Boston
Meritor Savings Bank v. Vinson. 1986. 477 US 57
Oncale v. Sundowner Offshore Services, Inc. 1998. 523 US 75
Posner R 1989 An economic analysis of sex discrimination laws. University of Chicago Law Review 56: 1311–35
Pryor J B et al. 1995 A social psychological model for predicting sexual harassment. Journal of Social Issues 51: 69–84
Reskin B 2000 The proximate causes of employment discrimination. Contemporary Sociology 29: 319–28
Rosen J 1998 In defense of gender blindness. The New Republic 218: 25–35
Saguy A C 2000 Sexual harassment in France and the United States: Activists and public figures defend their definitions. In: Lamont M, Thevenot L (eds.) Rethinking Comparative Cultural Sociology: Polities and Repertoires of Evaluation in France and the United States. Cambridge University Press, Cambridge, UK, pp. 56–93
Schuldt G 1997 Ex-Miller exec. copied page with anatomical word: man suing over firing says he showed copy of female co-worker. Milwaukee Sentinel (June 26), 1
Schultz V 1991 Telling stories about women and work: Judicial interpretations of sex segregation in Title VII cases raising the lack of interest argument. Harvard Law Review 103: 1749–843
Schultz V 1998 Reconceptualizing sexual harassment. Yale Law Journal 107: 1683–1804
Strossen N 1995 Defending Pornography: Free Speech, Sex, and the Fight for Women's Rights. Anchor Books, New York
Volokh E 1992 Freedom of speech and workplace harassment. UCLA Law Review 39: 1791–872
Welsh S 1999 Gender and sexual harassment. Annual Review of Sociology 25: 169–90
White R H 1999 There's nothing special about sex: The Supreme Court mainstreams sexual harassment. William & Mary Bill of Rights Journal 7: 725–53
Willis E 1992 Radical feminism and feminist radicalism. In: Willis E (ed.) No More Nice Girls: Countercultural Essays. Wesleyan University Press, London, pp. 117–50
Yoshino K 2000 The epistemic contract of bisexual erasure. Stanford Law Review 52: 353–461
V. Schultz and E. Goldsmith
Sexual Harassment: Social and Psychological Issues

Sexual harassment is generally defined as unwelcome sexual advances, requests for sexual favors, or other verbal or physical conduct of a sexual nature that is either a condition of work or is severe and pervasive enough to interfere with work performance or to create a hostile, intimidating work environment. It may consist of words, gestures, touching, or the presence of sexual material in the work environment. It typically involves a pattern of behavior over a period of time, rather than a single event; we might think of such a pattern as 'an episode of sexual harassment.' In perhaps 90 percent of the episodes, women are the recipients and men are the initiators, but both sexes can harass and both can be harassed by the same sex or the other sex. If sexual harassment meets certain criteria (e.g., unwelcome, severe, and pervasive), it is illegal in many countries, but not all behavior commonly considered sexual harassment violates the law.
1. A Short History of Research on Sexual Harassment

It is no doubt safe to assume that sexual harassment has been around for a long time, but it has been labeled, studied, and legislated for only about 20 years. In 1978, journalist Lin Farley wrote Sexual Shakedown to bring attention to the phenomenon. In 1979, legal scholar Catharine MacKinnon wrote an influential book that would provide a legal framework for dealing with sexual harassment in the US. MacKinnon argued that sexual harassment was a form of sex discrimination (i.e., it denies women equal opportunity in the workplace) and that therefore Title VII of the 1964 Civil Rights Act, which forbids discrimination on the basis of sex (among other social
categories), should apply. A year after her book was published, the US Equal Employment Opportunity Commission established guidelines on sexual harassment. Early empirical studies of sexual harassment in the workplace and academia started appearing in print about the same time. By 1982, at least one journal (Journal of Social Issues) had produced a whole issue devoted to scholarship on the topic. Today sexual harassment is studied by scholars in many countries who work in many fields, including law, psychology (clinical, forensic, organizational, social), sociology, management and human resources, history, anthropology, communication, and the humanities. Within the social sciences, sexual harassment is studied both through quantitative techniques that focus on the measurement of constructs and the determination of base-rate statistics, and through qualitative case-study techniques focusing on specific occupations such as wait staff and female coal miners.
2. Measurement of Sexual Harassment

In the early 1980s, researchers frequently used the term 'social-sexual behaviors' to distinguish a set of behaviors that might constitute sexual harassment from a legally defined measure of sexual harassment. These sets typically included behaviors unlikely to be considered sexual harassment either under the law or by a majority of the population. By including a broad range of behaviors, researchers could learn whether people's views of specific behaviors differ over time (or across samples). It would also allow researchers to see if legal and illegal social-sexual behaviors have common antecedents and consequences. More recently, however, researchers say that what they are measuring is sexual harassment itself. This has caused some confusion because many people seem to interpret statistics on sexual harassment to indicate the percentage of the workforce that would have a strong legal claim of sexual harassment. This is not true. Researchers have not attempted to capture the legal definition of harassment because: (a) the legal definition changes as the law develops, so the legal definition is a moving target; (b) laws vary from country to country; (c) targets may experience negative consequences of sexual harassment without having the harassment rise to meet a legal definition; (d) there is no reason to believe that we can learn about sexual harassment only by measuring it in a way that conforms to its legal definition. Some scholars now make explicit the point that a lay definition of sexual harassment does not necessarily imply that a law has been broken. A global item, say, 'Have you ever been sexually harassed?,' is rarely used to measure sexual harassment because some researchers contend that it results in under-reporting of the phenomenon. Workers seem reluctant to acknowledge that they have been sexually
harassed. In addition, asking respondents if they have been sexually harassed places a great cognitive load on them, as they would have to determine first what constitutes sexual harassment and then determine whether they had experienced any behavior that met those criteria. In many studies, a single question asking the respondent if she has been sexually harassed is used as an indicator of acknowledging or labeling sexual harassment rather than as an indicator of sexual harassment per se. Most studies measure sexual harassment by asking respondents if they have experienced any of a list of behaviors that might be considered sexual harassment. These measures are generally (but not always) designed to be suitable for both sexes. In some cases, respondents are asked whether they have experienced a list of behaviors that might be considered sexual harassment, and later in the survey are asked which of those behaviors they consider sexual harassment, allowing the researcher to determine which of a broad range of behaviors respondents have experienced that they consider sexual harassment. Multi-item measures of sexual harassment are also based on a list of behaviors that people may have experienced. The best known of these measures is the Sexual Experiences Questionnaire (SEQ) developed by Louise Fitzgerald and her colleagues. The SEQ has undergone many changes; the number of questions asked, the wording of questions, and the wording of responses have all been modified, and its refinement is ongoing. The number of subscales that emerge from it has also changed over time, so it is important to keep in mind that not all studies using the SEQ use the same set of questions scored the same way, and at this point the SEQ cannot be used to assess changes over time or differences across studies.
3. The Prevalence of Sexual Harassment

A number of studies have relied on random sample surveys, including studies of government workers, the military, specific geographical areas, and specific work organizations. Some of these have obtained quite high response rates, some studies have been conducted in Spanish as well as in English, and a few longitudinal analyses have been conducted. In measuring prevalence, there is some debate about the timeframe that should be considered. Researchers typically inquire about experiences within the past year, the prior two years, or throughout the person's entire work life, depending on the purpose of the study. While critics have cautioned that retrospective measures will introduce inaccuracy or bias, researchers active in the field are less concerned. It is difficult to know if one is currently being sexually harassed, and qualitative studies provide convincing examples of events initially not labeled sexual harassment that came to be so labeled at a later date.
3.1 The Prevalence of Sexual Harassment of Women

The random-sample studies together suggest that from about 35 percent to 50 percent of women have been sexually harassed at some point in their working lives, where sexual harassment refers to behavior that most people consider sexual harassment. Estimates are higher among certain groups such as women who work in male-dominated occupations. The most commonly reported social-sexual behaviors are the less severe ones, involving sexist or sexual comments, undue attention, or body language. Sexual coercion is, fortunately, much rarer, involving 1–3 percent of many samples of women. In contrast, a study of allegations in cases tried in court showed a much higher incidence of severe behaviors, with 22 percent involving physical assault, 58 percent nonviolent physical contact, and 18 percent violent physical contact. The incidence of sexual harassment of women appears to be rather stable. Three US Merit Systems Protection Board studies spanning 14 years show that 42–44 percent of women in the federal workforce have experienced one or more of a list of potentially sexually harassing behaviors within the previous 24 months. In addition, the number of charges filed with the US Equal Employment Opportunity Commission may be leveling off in the range of 15,000–16,000 per year (in a labor force of about 120 million people).

3.2 The Prevalence of Sexual Harassment of Men

Men have been included in studies of sexual harassment from the very beginning. In her study of a random sample of working men and women in Los Angeles County, Barbara Gutek found that some time during their working lives, from 9 percent to 35 percent of men (depending on the definition of harassment) had experienced some behavior initiated by one or more women that they considered sexual harassment. The US Merit Systems Protection Board studies found that from 14 percent to 19 percent of men in the federal workforce experienced at least one episode of a sexually harassing experience (initiated by either men or women) within the previous two years. These surveys revealed that about one-fifth of the harassed men were harassed by another man. Data from the 1988 US Department of Defense Survey of Sex Roles in the Active Duty Military revealed that about one-third of the men (but about 1 percent of the women) who experienced at least one of nine types of uninvited, unwanted sexual attention during the previous 12 months reported that the initiator was the same sex. The harassment of men by other men tends to be of two types: lewd comments that are considered offensive and attempts to enforce male gender role behaviors. Harassment by women is somewhat different, consisting of negative remarks and/or unwanted sexual attention.
The available research on sexual harassment of men, admittedly much less extensive than the research on women, suggests that many of the behaviors women might find offensive are not considered offensive by men when the initiators are women, and/or men report few negative consequences. In addition, a disproportionate percentage of men's most distressing experiences of sexual harassment come from other men. Presumably these are especially distressing because the recipient's masculinity and/or sexual orientation are being called into question.
3.3 The Prevalence of Sexual Harassment Among Other Groups

Relatively few studies have either focused on or found consistent differences in the experience of sexual harassment beyond the consistent differential rate of sexual harassment of women vs. men. It may be the case that younger and unmarried women are somewhat more likely targets of sexual harassment than older and married women. Lesbian women may be more likely to be sexually harassed than heterosexual women, or they may simply be more likely to label their experiences sexual harassment. Although several authors have suggested that in the US women of color (Asian, African-American, Hispanic, and American Indian) are more likely to be targets of sexual harassment than Caucasian women, the evidence is far from clear. Several random sample surveys found no clear link between ethnicity and the experience of sexual harassment, but qualitative studies suggest that women of color experience sexual harassment frequently. Two kinds of arguments have been advanced for why minority women might experience relatively more sexual harassment. A direct argument relies on stereotyping of minorities. Although the stereotypes of African-American women differ from stereotypes of Chicanas or Asian-American women, in each case the stereotype might place these women at greater risk. An indirect argument relies on concepts of power and marginality. As women of color are less powerful and more marginal by virtue of their ethnicity than white women, they may be more prone to sexual harassment.
4. Explanations for Sexual Harassment

Why sexual harassment exists has been of interest to social scientists since the phenomenon acquired a label. The various explanations can be subsumed into four categories: natural/biological perspectives, organizational perspectives, sociocultural explanations, and individual differences perspectives. These explanations tend to be broad in scope and not easily testable in a laboratory.
4.1 Natural/Biological Explanations

There are two natural/biological perspectives: a hormonal model and an adaptive/evolutionary explanation. While intriguing, neither is supported by available data.
4.2 Organizational Explanations

There are two organizational perspectives: sex-role spillover and organizational power. Sex-role spillover, defined as the carry-over into the workplace of gender-based expectations that are irrelevant or inappropriate to work, occurs because gender role is more salient than work role and because, under many circumstances, men and women fall back on sex role stereotypes to define how to behave and how to treat those of the other sex. Sex-role spillover tends to occur most often when the gender ratio is heavily skewed in either direction, i.e., when the job is held predominantly either by men or by women. In the first situation (predominantly male), nontraditionally employed women are treated differently from their more numerous male co-workers, are aware of that different treatment, report relatively frequent social-sexual behavior at work, and tend to see sexual harassment as a problem. In the second situation (predominantly female), where female workers hold jobs that take on aspects of the female sex-role and one of those aspects is sex object (e.g., cocktail waitress, some receptionists), women may become targets of unwanted sexual attention, but may attribute the way they are treated to their job, not their gender. Several studies find some support for this perspective. The earliest writings on sexual harassment were about men abusing the power that comes from their positions in organizations to coerce or intimidate subordinate women. Some subsequent statements on the power perspective are gender neutral, suggesting that although men tend to harass women, in principle if women occupied more positions of power, they might harass men in equal measure. This interpretation of sexual harassment as an abuse of organizational power is contradicted by research showing that about half or more of harassment comes from peers. In addition, both customers and subordinates are also sources of harassment. Sexual harassment by subordinates has been documented primarily in academic settings where, for example, studies find up to half of female faculty at universities had experienced one or more sexually harassing behaviors by male students. While power cannot explain all sexual harassment, various kinds of power—formal organizational power and informal power stemming from the ability to influence—remain potent explanations for at least some sexual harassment. For example, the fact that sexual harassment by customers is fairly common can be
explained, at least in part, by the emphasis employers place on customer satisfaction and the notion that the 'customer is always right.' Some researchers focus less on broad theoretical perspectives and more on the types of organizational factors that need to be included in models of sexual harassment. Thus far, contact with the other sex and an unprofessional and/or sexualized work environment have been identified as correlates of sexual harassment.

4.3 Sociocultural Explanations

There are at least two ways of thinking of the broader sociocultural context. One is that behavior at work is merely an extension of male dominance that thrives in the larger society. Overall, there is general agreement in the literature about the characteristics of the sex stratification system and the socialization patterns that maintain it. The exaggeration of these roles can lead to sexual harassment. For example, men can sexually harass women when they are overly exuberant in pursuing sexual self-interest at work, when they feel entitled to treat women as sex objects, or when they feel superior to women and express their superiority by berating and belittling the female sex. The second way of thinking of the broader sociocultural context is to study the sociocultural system itself and examine how and why status is assigned. According to this view, sexual harassment is an organizing principle of our system of heterosexuality, rather than the consequence of systematic deviance.

4.4 Individual Difference Explanations

Although the data suggest that most sexual harassers are men, most men (and women) are not sexual harassers. This makes the study of personality characteristics particularly relevant. The search for individual-level characteristics of perpetrators does not negate any of the other explanations, but helps to determine, for example, which men in a male-dominated society or which men in powerful positions in organizations harass women when most men do not. John Pryor developed a measure of the Likelihood to Sexually Harass (LSH) in men, consisting of 10 vignettes that place the respondent in a position to grant someone a job benefit in exchange for sexual favors. Currently the most widely known individual difference measure used in the study of sexual harassment, the LSH has been validated in a number of studies. For example, undergraduate men who score relatively high on the LSH demonstrate more sexual behavior in a lab experiment and hold more negative attitudes toward women relative to those who report a lower likelihood to sexually harass. Sexual harassment may be an attempt to immediately gratify the desire for discrimination, intimidation, or sexual pleasure, and therefore those people
with low self-control may also be more likely to sexually harass. Some research findings support this gender-neutral explanation.
5. Factors Affecting Judgments of Sexual Harassment

The most widely published area of research on sexual harassment measures people's perceptions about sexual harassment. One set of studies attempts to understand which specific behaviors (e.g., repeated requests for a date, sexual touching, stares or glances, a sexually oriented joke) respondents consider to be sexual harassment. The other set of studies attempts to understand factors that affect the way respondents perceive behavior that might be considered sexual harassment. Typically, respondents are asked to read a vignette in which factors are manipulated and then they are asked to make judgments about the behavior in the vignette. The factors that are manipulated include characteristics of the behavior (e.g., touching vs. comments), characteristics of the situation (e.g., the relationship between the initiator and recipient), and characteristics of the initiator and recipient (e.g., sex, age, attractiveness, occupation). In addition, characteristics of the rater (e.g., sex, age) are typically measured.

5.1 The Effects of Rater Sex on Judgments of Sexual Harassment

Sex is the most frequently studied feature in studies about perceptions of sexual harassment. In all, hundreds of studies have been done. These studies have been reviewed using traditional methods and meta-analyses. Two conclusions emerge from these reviews: (a) women consistently rate vignettes and specific behaviors as more sexually harassing than men do, and (b) the average difference is small, raising questions about the practical significance of these results. Practical significance is important because in the United States it would appear that these studies have had an indirect influence on the Ninth Circuit's 1991 decision, Ellison v. Brady. In that case, the court adopted a new legal standard in hostile environment cases of sexual harassment, the reasonable woman standard, which replaces the traditional reasonable person standard in that Circuit. Juries are asked to evaluate the events from the perspective of a reasonable woman: taking into account all the facts, would a reasonable woman consider the plaintiff to be sexually harassed? The predominant claim is that the new standard would force judges and juries to look at the case from the perspective of the complainant, who is typically a woman. This would presumably make it less difficult for a plaintiff to make a convincing claim of hostile work environment harassment. While some recent research suggests that a reasonable woman standard as currently implemented may not result in different
judgments than the traditional reasonable person standard, the magnitude of the gender gap in perceptions about sexual harassment may not justify a change in standards, regardless of the effect of the standard itself.
5.2 The Effects of Other Rater Characteristics on Judgments of Sexual Harassment

Although gender is far and away the most widely studied factor in the study of sexual harassment perceptions, other factors have been studied. For example, when the initiator is of higher status than the recipient, raters generally respond more positively toward the recipient, more negatively toward the initiator, and perceive more harassment than when the initiator is not a supervisor. In addition, several studies that compare students with workers find that students have a broader, more lenient view of social-sexual behavior relative to samples of workers, who are typically somewhat older and have more work experience.
6. Some Remaining Issues

While much is known, we lack a complete picture of sexual harassment. Must concern about sexual harassment eliminate any kind of dating or flirtation at work? How responsible should employers be for the behavior of their employees? How should targets of harassing behavior respond in order to eliminate harassment without damaging their own career possibilities?

See also: Gender and Place; Male Dominance; Sexual Attitudes and Behavior; Sexual Harassment: Legal Perspectives; Workplace Safety and Health
Bibliography

Borgida E, Fiske S T (eds.) 1995 Gender stereotyping, sexual harassment, and the law. Journal of Social Issues 51(1)
Bowes-Sperry L, Tata J 1999 A multiperspective framework of sexual harassment. In: Powell G N (ed.) Handbook of Gender and Work. Sage Publications, Inc., Thousand Oaks, CA, pp. 263–80
Brewer M B, Berk R A (eds.) 1982 Beyond Nine to Five: Sexual harassment on the job. Journal of Social Issues 38(4): 1–4
Estrich S 1991 Sex at work. Stanford Law Review 43: 813–61
Franke K M 1997 What's wrong with sexual harassment? Stanford Law Review 49: 691–772
Gutek B A 1985 Sex and the Workplace. Jossey-Bass Publishers, San Francisco
Gutek B A, Done R 2001 Sexual harassment. In: Unger R K (ed.) Handbook of the Psychology of Women and Gender. Wiley, New York
Harris v. Forklift Systems. 1993. 114 S. Ct. 367
MacKinnon C A 1979 Sexual Harassment of Working Women: A
Case of Sex Discrimination. Yale University Press, New Haven, CT
O'Donohue W (ed.) 1997 Sexual Harassment: Theory, Research, and Treatment. Allyn and Bacon, Boston
Pryor J B, McKinney K (eds.) 1995 Special issue: Research advances in sexual harassment. Basic and Applied Social Psychology 17(4)
Stockdale M S 1996 Sexual Harassment in the Workplace: Perspectives, Frontiers, and Response Strategies. Sage Publications, Inc., Newbury Park, CA
Welsh S 1999 Gender and sexual harassment. Annual Review of Sociology 25: 169–90
Williams C L, Giuffre P A, Dellinger K 1999 Sexuality in the workplace: Organizational control, sexual harassment, and the pursuit of pleasure. Annual Review of Sociology 25: 73–93
B. A. Gutek
Sexual Orientation and the Law

Sexual orientation refers to any classification based on sexual desire or conduct, such as heterosexuality, same-sex sexuality, or bisexuality. It is a classification that includes all sexual orientations, just as race is a classification that includes all races (White, Black, Asian, etc.). However, just as discussions of race tend to focus on marginalized racial groups (such as people of color), discussions of sexual orientation commonly refer to minority sexual orientations: gay, lesbian, or bisexual. As a result, this article will reflect that focus on gay people. It will, however, also note the growing literature on heterosexuality and transgender people.
1. Homosexuality and Heterosexuality as Historically Contingent

The study of the law relating to sexually marginalized people, also known as 'queer legal theory,' evolved in the 1980s and 1990s out of feminist legal theory, postmodern theory, and critical race theory. The most influential work informing queer legal theory is Michel Foucault's History of Sexuality, Vol. 1 (Foucault 1978), which posited that sexual orientation is socially constructed rather than naturally or divinely ordained. This point has been further developed by Judith Butler's performativity theory of sexual orientation and gender, which posits that identities are performed rather than biologically or divinely ordained (Butler 1990).
1.1 From Conduct to Status and Back to Conduct

The social construction of sexual orientation is revealed by the different ways in which society and law have viewed same-sex sexuality. Different terms used
to describe same-sex sexuality reflect these changes. Prior to the late nineteenth century, same-sex sexuality was viewed as sinful conduct, redeemable through repentance. The terms 'homosexual' and 'heterosexual' did not appear in English until 1892, along with the idea that sexual orientation was a status. These terms are products of German and English medical research on same-sex sexuality, which saw same-sex sexuality as a status, gave it a name, contrasted it with the previously unmarked status of heterosexuality, and assigned it social meaning as a constellation of characteristics that a person is (rather than particular sexual acts that one does). The terms 'invert' or 'homosexual' refer to this medicalized understanding of same-sex sexuality as a sickness rather than sinful conduct. Oscar Wilde's 1895 trials and imprisonment for gross indecency are credited with introducing into popular culture the status of the male homosexual as effete, artistic, self-centered, and sybaritic. A third stage, beginning in the 1940s and 1950s, relied on the scientific research of Alfred Kinsey and Evelyn Hooker to contend that same-sex sexuality was a normal variation from opposite-sex sexuality, rather than a disease (Hooker 1957, Kinsey et al. 1948). The nascent homophile movement adopted the term 'gay' to distance itself from the medical judgment associated with 'homosexual.' A fourth stage, emerging through queer theory in the 1990s, rejected the very notion of sexual orientation status (evil or neutral), pointing out the indeterminate and potentially subordinating qualities associated with status-based notions of identity. This stage dubbed minority sexual orientations 'queer' (reclaiming the epithet), treating this anti-status as an epistemological category. To be queer is to believe that racial, sexual, and sexual orientation subordination is wrong. Queer theory arguably uncouples status from conduct, so that one person can be both queer and actively heterosexual.

1.2 Fluidity Between Sexual Orientation and Gender Identity

Sexual orientation and gender issues overlap, but are also distinct in important ways. Both are associated with gender nonconformity. They differ, however, in their fundamental premises. Conventional approaches to sexual orientation presuppose two sexes, and categorize a woman as lesbian if she is with another woman (homosexual literally meaning same-sex), and heterosexual if she is with a man (heterosexual literally meaning different-sex). Transgender theory, in contrast, posits that there are more than two sexes, and that people exist along a continuum of masculinities and femininities. This reasoning renders gay theory's primary assumption absurd; to speak of a person being gay or heterosexual ceases to make sense when there are many more than the two options of being a woman-with-a-woman or a woman-with-a-man.

'Transgender' generally refers to gender norm transgression. Some transgender people, known as transsexuals, undergo medical treatment to change from one sex to another. Other categories of transgender people include transvestites (people who cross-dress but do not undergo medical gender reassignment) and transgendered people (those who do not conform to gender norms for the sex they were assigned at birth, but neither do they undergo medical treatment to reassign their sex). Some cultures do not draw bright lines to distinguish same-sex sexuality from gender identity issues. In Bolivia, for example, men who engage in same-sex sexuality do so within a cultural understanding that one of the participants, in some sense, is female. Men who engage in same-sex sexuality call themselves gente de ambiente, or people of the atmosphere (West and Green 1997). Three subcategories of gente de ambiente highlight the overlap between sexual orientation and transgender identity in Bolivian culture. Travestis, or transvestites, view themselves as women trapped in men's bodies. Camuflados, the camouflaged, see themselves as men, but take the receptive role in intercourse, and often pass as heterosexual in public spaces. The third category, hombres, or just men, take the penetrating role in intercourse, and are distinguished from heterosexual men in that they respond to sexual overtures from other men. In Bolivia, and elsewhere, these sexual orientation and gender categories are fluid. A man might sleep with men exclusively, only occasionally, or for money. He might desire to sleep with men, but refrain for fear of social or legal condemnation. The very difficulty in deciding the characteristics of identity categories forms the thesis of much postmodern scholarship that identity generally, and sexual identity in particular, is indeterminate because it varies historically, culturally, and within a particular individual's lifetime.

2. Changing Focus of Sexual Orientation Research
Sexual orientation law is a synergy of gay rights advocacy and theoretical academic writings. Advocates bring cases and support legislation countering anti-gay discrimination. Scholars both craft theories that inform these efforts and evaluate the cases and statutes that become law. Two important areas involve anti-sodomy laws and the ban on same-sex marriage.
2.1 Anti-sodomy Laws

Because the criminalization of same-sex sexuality is the most obvious expression of state condemnation, sexual orientation law began by challenging anti-sodomy laws. In the US case Bowers v. Hardwick, 478 US 186 (1986), the US Supreme Court upheld
Georgia's statute criminalizing sodomy against a constitutional privacy challenge. The Court issued extraordinarily homophobic declarations, such as the concurring opinion's citation of Blackstone (1859) to describe same-sex sodomy as 'an offense of "deeper malignity" than rape, a heinous act "the very mention of which is a disgrace to human nature."' The gratuitous nastiness inspired both numerous critiques of the decision (Goldstein 1988, Thomas 1992) and challenges to other laws disadvantaging gay people. One strand of this critique challenges the validity of anti-sodomy statutes, arguing that sodomy is an historically contingent category. For example, early Roman–Dutch law (imported to Colonial South Africa) interpreted sodomy as including a wide range of nonconforming sexual acts, such as anal penetration, bestiality, masturbation, oral penetration, penetration with an inanimate object, interfemoral intercourse, and heterosexual intercourse between a Jew and a Christian (West and Green 1997, p. 6). This insight undermines the logic of sodomy law defenders, who, along with the US Supreme Court in Bowers v. Hardwick, rest their defense on 'millennia of moral teaching' (Goldstein 1988). If what counts as reprehensible conduct differs markedly among people, places, and times, theorists reason, then one cannot invoke a uniform condemnation of the conduct. While many governments have decriminalized same-sex sexuality, the criminal ban remains strong in some places. In Malaysia in 2000, for example, former deputy Prime Minister Anwar Ibrahim was convicted of sodomy and sentenced to nine years in prison after a 14-month trial. At the opposite extreme, the 1996 South African Constitution explicitly forbade sexual orientation discrimination. The ban on gays in the military, like anti-sodomy laws, excludes gay people from full citizenship. While the ban remains in the US, other countries, such as Israel, allow gay people to serve in the military.
2.2 The Ban on Same-sex Marriage
Although anti-sodomy laws are rarely enforced, their existence impedes other anti-discrimination claims, such as gay people's rights to marry or to serve in the military. Since the 1980s, many countries (e.g., South Africa) and US states (e.g., Georgia) have decriminalized same-sex sexuality. Legal advocacy and scholarship have moved on to contest discrimination in other areas. In Romer vs. Evans, 517 US 620 (1996), the US Supreme Court invalidated a state constitutional amendment that forbade any state entity from protecting gay, lesbian, or bisexual people from discrimination. Advocates and legal scholars, however, have paid far more attention to marriage litigation. In 2001, the Netherlands became the first country to allow same-sex couples to marry. Other countries, such as Denmark, France, Greenland, Hungary, Iceland, Norway, and Sweden, each recognize some type of partnership, which accords same-sex couples (and sometimes opposite-sex unmarried couples) various benefits accorded to married couples. Parts of Canada and Spain also recognize same-sex partnerships, and domestic partners enjoy some employment benefits in Argentina, Canada, Israel, New Zealand, and South Africa. Other countries, such as Australia, Austria, Belgium, Brazil, the Czech Republic, Germany, Portugal, Spain, and the UK, recognize narrower rights, such as succession rights in private housing and some inheritance rights. Australia, New Zealand, Belgium, Denmark, Finland, Germany, Iceland, The Netherlands, Norway, South Africa, Sweden, and the UK recognize same-sex relationships for immigration purposes. Smaller governmental units, such as cities, counties, and provinces around the world, ban discrimination on the basis of sexual orientation. In the USA, only one state, Vermont, accords same-sex relationships recognition akin to marriage; the federal government and a majority of the states explicitly refuse to recognize same-sex marriage. The Vermont Supreme Court in Baker vs. Vermont, 744 A.2d 864 (Vt. 1999) found that banning same-sex marriage (or its equivalent under another name) violated the Common Benefits Clause of the Vermont constitution. That clause provides that 'government is, or ought to be, instituted for the common benefit, protection, and security of the people, nation, or community, and not the particular emolument or advantage of any single person, family, or set of persons, who are only a part of that community.' Following the court's instructions, the Vermont legislature created a structure parallel to marriage for same-sex couples, calling it civil union.
3. Theoretical Approaches to Sexual Orientation Law
The emphasis in current theory and research on sexual orientation law is to deconstruct and reconstruct legal regulations that subordinate gay and transgendered people. This rubric includes various ideological and methodological approaches.
3.1 Ideology
Ideological diversity includes both classical liberal and critical approaches. Liberal approaches suggest that gay people are sufficiently similar to heterosexuals to justify equal legal treatment. This approach suggests that altering legal rules to lift the ban on gay participation in marriage or the military will not fundamentally change these institutions. A critical approach, in contrast, contests essentialized notions of identity, seeing identity as socially constructed rather than reflecting essential commonalities among people who engage in same-sex sexuality. This critical approach often seeks a more comprehensive restructuring of legal regulations than simply adding gay people to existing institutions. Instead, critical approaches propose incorporating sexual orientation analysis into other antisubordination discourses, such as feminism, critical race studies, and class analysis. Critical theorists might further propose alternatives to marriage, such as domestic partnership for all people, contending that only new institutions can alleviate the sex/gender hierarchies inherent in marriage.
3.2 Methodology
Postmodern, empirical, and legal economic approaches have contributed to the literature on sexual orientation law. One empirical approach compiles data about rates of arrest and prosecution for sexual offenses, and points out that consensual same-sex sexual activity is unfairly singled out for criminal penalty (Eskridge 1999). The two major premises of this approach, that like parties should be treated alike, and further that the state should generally refrain from interfering in private consensual activities, together lead to the conclusion that the state statutes criminalizing sodomy should be invalidated. One postmodern approach closely examines the language and reasoning of a judicial opinion to decode the cultural context of the decision, revealing, for example, the Supreme Court's strategic and biased collapse of status and conduct in Bowers vs. Hardwick. This approach focuses on how the majority decision in Hardwick collapsed the distinction between status and conduct, and ignored the many cross-sex couples who commit sodomy by engaging in acts such as fellatio, cunnilingus, and anal intercourse. The majority in Hardwick framed the question as 'whether the Federal Constitution confers a fundamental right upon homosexuals to engage in sodomy,' focusing the legal inquiry on both a bad act (sodomy) and a bad status (homosexuality). Under this reasoning: [I]f sodomy is bad … then homosexuals and heterosexuals who do it are bad. If homosexuals are bad, we are bad whether we've engaged in sodomy or not. To hold both of these positions with consistency, you have to be willing to say that many, many heterosexuals are bad. This the majority Justices never acknowledged … They wanted the badness of each to contaminate the other—while heterosexual personhood remained out of the picture, protected from the taint with which it was logically involved. (Halley 1999, pp. 7–8)
By leaving the categories of sexual orientation status and conduct ambiguous, legal doctrine governing consensual sexuality 'remained always ready to focus on "act" or "status" according to the expediencies of the situation' (Halley 1999, p. 8).
An emerging strand of legal scholarship uses legal economic premises to examine heterosexuality as well as same-sex sexuality. US Court of Appeals Judge Richard Posner's book Sex and Reason (1992) posited a bioeconomic theory of sexuality that combined sociobiology with legal economics. Posner's often controversial analysis (describing, for example, the economic efficiency in some contexts of female infanticide and baby selling, which he renames 'parental-right selling') provoked a firestorm of response. Yet recent scholarship has built on Posner's economic approach to posit a bargaining theory of sexuality, suggesting that legal regulation should equalize the bargaining positions between men and women (Hirshman and Larson 1998).
4. Future Directions of Theory and Research
The youth of sexual orientation legal research makes it unpredictable. Four likely future trends emerge. First, future research may well include increased use of empirical methods, such as compilations of data from court records in criminal or family law cases. Second, queer legal theory's emphasis on post-identity analysis and legal doctrine's focus on identity as a foundational category require that future scholarship address the tensions between deconstructing identity and reconstructing a legal regime that does not discriminate on the basis of sexuality or gender performance. Third, scholars are likely to try to resolve the analytical tension between a binary construction of sex that underlies gay theory and a fluid construction of sex that underlies transgender approaches. Fourth and finally, future scholarship and advocacy may focus on the legal regulation of heterosexuality and bisexuality, further developing the literature on how legal regulation misconstrues identity as fixed or natural. On an ideological level, theoretical research is likely to continue to develop in both liberal (assimilationist) and critical (utopian) strands. See also: Civil Rights; Family Law; Feminist Legal Theory; Gay/Lesbian Movements; Gender and the Law; Privacy: Legal Aspects; Queer Theory; Regulation: Sexual Behavior; Sex Segregation at Work; Sexual Attitudes and Behavior; Sexual Orientation: Historical and Social Construction; Social Class and Gender
Bibliography
Blackstone W 1859 Commentaries. Harper Brothers, New York, Vol. IV
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Eskridge W 1999 Gaylaw: Challenging the Apartheid of the Closet. Harvard University Press, Cambridge, MA
Foucault M 1978 The History of Sexuality, Vol. I: An Introduction. Pantheon, New York
Goldstein A 1988 History, homosexuality and political values: Searching for the hidden determinants of Bowers v. Hardwick. Yale Law Journal 97: 1073–103
Halley J 1999 Don't: A Reader's Guide to the Military's Anti-Gay Policy. Duke University Press, Durham, NC
Hirshman L, Larson J 1998 Hard Bargains: The Politics of Sex. Oxford University Press, New York
Hooker E 1957 The adjustment of the male overt homosexual. Journal of Projective Techniques 21: 18–31
International Lesbian and Gay Association, http://www.ilga.org/
Kinsey A, Pomeroy W, Martin C 1948 Sexual Behavior in the Human Male. W. B. Saunders, Philadelphia, PA
Posner R 1992 Sex and Reason. Harvard University Press, Cambridge, MA
Robson R 1992 Lesbian (Out)law: Survival Under the Rule of Law. Firebrand Books, Ithaca, NY
Rubinstein W 1997 Sexual Orientation and the Law (2nd edn.). West, St. Paul, MN
Symposium: InterSEXionality: Interdisciplinary perspectives on queering legal theory 1998. Denver University Law Review 75: 1129–464
Thomas K 1992 Beyond the privacy principle. Columbia Law Review 92: 1431–516
Valdes F 1995 Queers, sissies, dykes and tomboys: Deconstructing the conflation of 'sex,' 'gender,' and 'sexual orientation' in Euro-American law and society. California Law Review 83: 1–377
West D, Green R 1997 Sociolegal Control of Homosexuality: A Multinational Comparison. Plenum Press, New York
M. M. Ertman
Sexual Orientation: Biological Influences
How humans develop sexual orientations is a question that people have pondered for at least a century. In the later part of the twentieth century, scientists began to articulate sophisticated biological theories of how sexual orientations develop. This article critically surveys contemporary biological theories of the development of sexual orientations. In particular, it examines three recent studies of how sexual orientations develop and their theoretical underpinnings. The focus is on these studies because they are not only positively cited by almost every scientist trying to develop a biological theory of sexual orientation, but also typical in their assumptions and methodology.
1. What Is a Sexual Orientation?
A person's sexual orientation concerns his or her sexual desires and fantasies towards others in virtue of their sex (gender).
However, a person's sexual orientation is only one part of a person's sexual interest generally. People have a wide range of sexual tastes. Some are attracted to people of certain ages, people of certain body types, of certain races, of certain hair colors, of certain personality types, of certain professions, as well as to people of a certain sex and a certain sexual orientation. Further, people are not only sexually interested in certain sorts of people, some also have quite specific interests in certain sorts of sexual acts, certain venues for sex, or a certain frequency of having sex. We recognize that people can be sorted into all sorts of groups in virtue of their sexual interests, but most contemporary scientific studies focus only on the sex of the people a person is sexually attracted to as an essential feature about him or her. Doing so may be culturally salient but it is not scientifically justified.
2. What Makes a Theory a Biological One?
To say that sexual orientation is biologically based is an ambiguous claim; there are various senses in which it is trivially true that sexual orientation is biological. Everything psychological is biologically based. Humans can have sexual orientations while inanimate objects and one-celled organisms cannot because of our biological/psychological make-up. The same sort of claim is true with respect to having a favorite type of music: humans but not single-celled organisms can have a favorite type of music. Even though a preference for classical music seems a paradigmatic example of a learned trait, such a preference is also biological in that a certain cognitive complexity is required in order to have such a preference. Sexual orientation is at least biologically based in the sense that musical preferences are. The central claim of biological research on sexual orientation is a much bolder one. It says a person's sexual orientation is inborn or determined at a very early age and, as a result, a person's sexual orientation is 'wired into' his or her brain. To understand the significance of such a claim, it is useful to contrast three models of the role genes and other biological factors might play in sexual orientation (Byne 1996, Stein 1999, pp. 123–7). According to the permissive model, genes or other biological factors influence neuroanatomical structures on which experience inscribes sexual orientation, but biological factors do not directly or indirectly determine sexual orientation. Something like the permissive model correctly describes the development of musical preferences. Various genetic and biological factors make it possible for our experiences with various kinds of music to shape our musical preferences. Contrast the permissive model with the direct model, according to which genes, hormones, or other biological factors directly influence the brain structures that underlie sexual orientation.
According to the direct model, the neurological structures responsible for the direction of a person's sexual attraction toward men or women develop at a very early age as a result of a person's genes or other biological factors. One version of the direct model sees genes in the q28 region of the X chromosome as coding for a set of proteins that causes the INAH-3 region of the hypothalamus to develop so as to determine a person's sexual orientation (LeVay and Hamer 1994). The direct and the permissive models can be contrasted with the indirect model, according to which genes code for (and/or other biological factors influence) temperamental or personality factors that shape how a person interacts with his or her environment and experiences of it, which, in turn, affects the development of his or her sexual orientation. On this view, the same gene (or set of genes) might predispose to homosexuality in some environments, to heterosexuality in others, and have no effect on sexual orientation in still others. An example of such a theory is Daryl Bem's theory of sexual orientation, according to which biological factors code for childhood personality types and temperaments (for example, aggressiveness, willingness to engage in physical contact, and so on) (Bem 1996, Peplau et al. 1998). In societies like ours, where there are significantly different gender roles typically associated with men and women, these different personality and temperament types get molded into gender roles that, in turn, play a crucial role in the development of sexual orientation. The three biological theories of the development of sexual orientation discussed below accept the direct model. However, what evidence there is for these theories is equally consistent with the indirect model.
3. Is Sexual Orientation Wired into the Brain?
In 1991, Simon LeVay published a study of the size of INAH-3, a particular cell group in the hypothalamus (LeVay 1991). Starting from the assumption that there are neurological differences between men and women, LeVay decided to look for sexual-orientation differences in some of the areas of the hypothalamus that seem to exhibit sex differentiation. He reasoned as follows: given that most people who are primarily attracted to women are men and most people who are primarily attracted to men are women, in order to discover where sexual orientation is reflected in the brain, we should look in parts of the brain that are structured differently for men and women. This picture is based on seeing gay men as having female-typical characteristics and seeing lesbians as having male-typical characteristics. Seeing gay men and lesbians as gender inverts has a certain cultural salience but its scientific merit has been subject to serious criticism (Stein 1999, pp. 202–5, Byne 1994). To examine the hypothalamus, LeVay had to study portions of human brain tissue that are accessible only after the person has died.
Further, LeVay needed to know the sexual orientations of the people associated with the brain tissue he was studying. LeVay's study was made possible as a result of the AIDS epidemic, which had the result of making available brains from people whose self-reported sexual histories are to some extent part of their medical records. LeVay examined 41 brains: 19 of them from men LeVay presumed to be gay because they had died of complications due to AIDS and their medical records suggested that they had been exposed to HIV (the virus that causes AIDS) through sexual activity with other men; six of them from men of undetermined sexual orientation who also died of AIDS and who LeVay presumed were heterosexual; 10 of them from men of undetermined sexual orientation who died of causes other than AIDS and who were also presumed to be heterosexual; and six of them from women, all of whom were presumed to be heterosexual, one of whom died of AIDS and five who died from other causes. LeVay found that, on average, the INAH-3 of the presumed gay men were significantly smaller than those of the presumed heterosexual men and about the same size as those of the women. From this, he inferred that gay men's INAH-3 are in a sense 'feminized.' Although LeVay rather cautiously concluded that his 'results do not allow one to decide if the size of INAH-3 in an individual is the cause or consequence of that individual's sexual orientation or if the size of INAH-3 and sexual orientation co-vary under the influence of some third unidentified variable,' he also said that his study illustrates that 'sexual orientation in humans is amenable to study at the biological level' (LeVay 1991, p. 1036). In media interviews after the publication of his study he made even stronger claims; he said, for example, that the study 'opens the door to find the answer' to the question of 'what makes people gay or straight' (Gelman 1992).
4. Is Sexual Orientation Inherited?
Various studies suggest that sexual orientation runs in families (e.g., Pillard and Weinrich 1986). Such studies show that a same-sex sibling of a homosexual is more likely to be a homosexual than a same-sex sibling of a heterosexual is to be a homosexual; more simply, for example, the brother of a gay man is more likely to be gay than the brother of a straight man. These studies do not establish that sexual orientation is genetic because most siblings, in addition to sharing a significant percentage of their genes, share many environmental variables; that is, they are raised in the same house, are fed the same meals, attend the same schools, and have many of the same adult role models. For these reasons, disentangling inherited and environmental influences requires more sophisticated studies.
Heritability studies done by Michael Bailey and Richard Pillard assessed sexual orientation in identical twins, fraternal twins, nontwin biological siblings, and similarly-aged unrelated adopted siblings (Bailey and Pillard 1991, Bailey et al. 1993). If sexual orientation is genetic, then, first, all identical twins should have the same sexual orientation and, second, the rate of homosexuality among the adopted siblings should be equal to the rate of homosexuality in the general population. If, on the other hand, identical twins are as likely to have the same sexual orientation as adopted siblings, this suggests that genetic factors make very little contribution to sexual orientation. In both twin studies, subjects were recruited through ads placed in gay publications that asked for homosexual or bisexual volunteers with twin or adoptive siblings. Volunteers were encouraged to reply to the ad 'regardless of the sexual orientation' of their siblings. In both of these studies, the percentage of identical twins who are both homosexual is substantially higher than the percentage with respect to fraternal twins. For example, 48 percent of the identical twins of lesbians were also lesbians, 16 percent of the fraternal twin sisters were lesbians, 14 percent of the nontwin biological sisters were lesbians, as were 6 percent of adoptive sisters. Because the identical twins of lesbians are not all lesbians, these results show that sexual orientation is not entirely the result of genetic factors. However, the higher concordance rate is consistent with a genetic effect because identical twins share all of their genes while fraternal twins, on average, share only half of their genes. Also consistent with a genetic effect is the result that the concordance rates for both types of twins are higher than the concordance rates for adopted siblings.
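To illustrate how such concordance figures are conventionally read (a back-of-envelope heuristic offered here for exposition, not a calculation reported in the studies themselves), Falconer's method estimates broad heritability from the gap between identical (MZ) and fraternal (DZ) twin resemblance:

h^2 \approx 2\,(r_{MZ} - r_{DZ})

Treating the reported concordances for women as if they were correlations, a crude simplification, gives h^2 \approx 2(0.48 - 0.16) = 0.64. This number should be taken only as a sketch of the logic: proper estimates require tetrachoric correlations derived from population base rates, and the sampling problems discussed below affect the inputs directly.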
5. Is Sexual Orientation Genetic?
Building on heritability studies, Dean Hamer and his collaborators obtained DNA samples from gay brothers in families in which homosexuality seemed surprisingly common. These samples were analyzed using linkage analysis, a technique for narrowing the location of a gene for some trait, to see if there was any particular portion of the X chromosome that was identical in the pairs of brothers at an unexpectedly high frequency (Hamer et al. 1993). Hamer found that a higher than expected percentage of the pairs of gay brothers had the same genetic sequences in a particular portion of the q28 region of the X chromosome (82 percent rather than the expected 50 percent). In other words, he found that gay brothers are much more likely to share the same genetic sequence in this particular region than they are to share the same genetic sequence in any other region of the X chromosome. Hamer's study suggests that the q28 region is the particular place where sexual orientation differences are inscribed. This study does not, contrary to popular belief, claim to identify any particular genetic sequence associated with homosexuality.
At best, it has found that many of the pairs of homosexual brothers had the same genetic sequences in this portion of the X chromosome. When Hamer is at his most precise, he says that his study shows that 'at least one subtype of male sexual orientation is genetically influenced' (Hamer et al. 1993, p. 321). In other contexts, Hamer is less careful. For example, in his book, The Science of Desire, he talks of 'gay genes,' going so far as to use this term in the book's subtitle (Hamer and Copeland 1994).
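The statistical reasoning behind the 82 percent figure can be sketched as follows (the underlying counts, 33 concordant pairs out of 40, come from Hamer et al. 1993 and are not given in this article). Two brothers inherit the same maternal X-chromosome segment with probability 1/2, so under the null hypothesis of no linkage the number of concordant pairs is binomial, and the probability of a result at least as extreme as the one observed is

P(X \geq 33) = \sum_{k=33}^{40} \binom{40}{k} \left(\tfrac{1}{2}\right)^{40} \approx 2 \times 10^{-5}

This is why the clustering at q28 was treated as statistically significant. The calculation is only a simplified rendering of linkage analysis, and, as noted above, it identifies a shared region, not any particular sequence within it.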
6. Problems with These (and Other) Studies
6.1 Lack of Confirmation
Independent confirmation is the earmark of the scientific method. LeVay's results have not been independently confirmed (Byne 1996) and Hamer's study has recently been disconfirmed (Rice et al. 1999). Although various research teams have confirmed the twin study results, a recent and more sophisticated study by Bailey undermines the methodology of the early twin studies (Bailey et al. in press). Bailey systematically recruited subjects from a registry of identical and fraternal twins in Australia. In women, the percentage of identical twins of bisexuals and homosexuals who are also either bisexual or homosexual was between 24 and 30 percent (depending on how the boundaries of these groups are drawn), while the percentage of same-sex fraternal twins of bisexual and homosexual women was between 10 and 30 percent. Not only is the difference between identical and fraternal twins significantly smaller than in previous studies but the percentages for identical twins with the same sexual orientations are dramatically lower. Although these results can be read as consistent with the direct model, the evidence is much weaker than earlier heritability studies suggested.
6.2 Problems with the Subject Pool
This Australian study shows that the results of earlier twin and family studies (Bailey and Pillard 1991, Bailey et al. 1993, Hamer et al. 1993), which recruited subjects through HIV clinics, lesbian and/or gay organizations, newspapers, and other nonsystematic methods, must have been inflated by sampling bias. In particular, it suggests that gay men and lesbians with identical twins of the same sexual orientation are more likely to participate in such studies. If an experiment makes use of a subject pool that is in some way biased, then this gives rise to doubts about the conclusions based on it. LeVay's subject pool is also biased. In particular, the homosexual population in his study is made up exclusively of men who died from complications due to AIDS and who told hospital staff that they had engaged in same-sex sexual activities.
6.3 Methods of Classification
A related problem with such studies concerns the determination of subjects' sexual orientations. Conclusions based on studies that inaccurately assign sexual orientations to subjects are weak. LeVay assumed, on the basis of no particular evidence, that all the women in his study were heterosexual and that all the men in his studies whose hospital records did not indicate same-sex sexual activity were heterosexual. Other studies that rely on a family member to report a person's sexual orientation may be similarly problematic. Further, by assuming that there are two sexual orientations—gay and straight—and that people can be easily and reliably classified as one or the other on the basis of their behavior or their self-report, scientific research accepts, without strong justification, that our cultural presumptions about human sexual desires are scientifically valid (Stein 1999, pp. 201–13).
6.4 Undefended Assumptions
Generally, most biological research on sexual orientation accepts without argument a quite particular picture of sexual orientation. For example, many studies in the emerging research program unquestioningly accept the inversion assumption, according to which lesbians and gay men are seen as gender inverts, and many studies assume the direct relevance of animal models of sexual behavior to human sexual orientation (Stein 1999, pp. 164–79). More crucially, such studies typically accept that a person's sexual orientation is a deep scientific property about her and that sexual orientation is a 'window into a person's soul.' This view of the centrality of sexual orientation to human nature is neither culturally universal nor scientifically established (Stein 1990, Stein 1999, pp. 93–116).
7. Conclusion
Although there is some evidence that is consistent with biological factors playing an indirect role in the development of sexual orientations, there is no convincing evidence that biological factors are a direct cause of sexual orientation. How human sexual desires develop is an interesting research question, but we are probably rather far from answering it. See also: Culture as Explanation: Cultural Concerns; Feminist Theory: Radical Lesbian; Sexual Preference: Genetic Aspects
Bibliography
Bailey J M, Pillard R C 1991 A genetic study of male sexual orientation. Archives of General Psychiatry 48: 1089–96
Bailey J M, Pillard R C, Neale M C, Agyei Y 1993 Heritable factors influence sexual orientation in women. Archives of General Psychiatry 50: 217–23
Bailey J M, Dunne M P, Martin N G in press The distribution, correlates, and determinants of sexual orientation in an Australian twin sample
Bem D J 1996 Exotic becomes erotic: A developmental theory of sexual orientation. Psychological Review 103: 320–35
Byne W 1994 The biological evidence challenged. Scientific American 270(May): 50–5
Byne W 1996 Biology and sexual orientation: Implications of endocrinological and neuroanatomical research. In: Cabaj R, Stein T (eds.) Textbook of Homosexuality and Mental Health. American Psychiatric Press, Washington, DC
Gelman D 1992 Born or bred? Newsweek, February 24: 46–53
Hamer D, Copeland P 1994 The Science of Desire: The Search for the Gay Gene and the Biology of Behavior. Simon and Schuster, New York
Hamer D H, Hu S, Magnuson V L, Hu N, Pattatucci A 1993 A linkage between DNA markers on the X chromosome and male sexual orientation. Science 261: 321–7
LeVay S 1991 A difference in hypothalamic structure between heterosexual and homosexual men. Science 253: 1034–7
LeVay S, Hamer D H 1994 Evidence for a biological influence in male homosexuality. Scientific American 270(May): 44–9
Peplau L A, Garnets L D, Spalding L R, Conley T D, Veniegas R C 1998 A critique of Bem's 'exotic becomes erotic' theory of sexual orientation. Psychological Review 105: 387–94
Pillard R C, Weinrich J 1986 Evidence of familial nature of male homosexuality. Archives of General Psychiatry 43: 808–12
Rice G, Anderson C, Risch N, Ebers G 1999 Male homosexuality: Absence of linkage to microsatellite markers at Xq28. Science 284: 665–7
Stein E D (ed.) 1990 Forms of Desire: Sexual Orientation and the Social Constructionist Controversy. Garland, New York
Stein E D 1999 The Mismeasure of Desire: The Science, Theory, and Ethics of Sexual Orientation. Oxford University Press, New York
E. Stein
Sexual Orientation: Historical and Social Construction
Sexual orientation is a much used but fundamentally ambiguous concept. It came into use in discussions of sexuality in the 1970s largely as a synonym for homosexual desire and object choice, less frequently for heterosexual patterns. 'Sexual orientation' suggests an essential sexual nature. The task of historical or social constructionist approaches is to suggest that this belief itself is what needs investigation. Constructionist approaches seek to do two broad things: to understand the emergence of sexual categorizations (such as 'the homosexual' or 'the heterosexual' in western cultures since the nineteenth century) within their specific historical and cultural contexts; and to interpret the sexual meanings, both subjective and social, which allow people to identify with, or reject, these categorizations.
It is, thus, largely preoccupied, not with what causes individual desires or orientations, but with how specific definitions develop within their historic contexts, and the effects these definitions have on individual self-identifications and collective meanings.
1. The Rise of Constructionist Approaches to Sexuality
The classical starting point for social constructionist approaches is widely seen as an essay on The Homosexual Role by the British sociologist Mary McIntosh (1968). Its influence can be traced in a range of historical studies from the mid-1970s (Weeks 1977, Greenberg 1988), and it has been anthologized frequently (e.g., Stein 1992). What is important about the work is that it asks what was at the time a new question: not, as had been traditional in the sexological tradition from the late nineteenth century, what are the causes of homosexuality, but rather, why are we so concerned with seeing homosexuality as a condition that has causes? And in tackling that new question, McIntosh proposed an approach that opened up a new research agenda through seeing homosexuals 'as a social category, rather than a medical or psychiatric one.' Only in this way, she suggests, can the right questions be asked, and new answers proposed. Using Kinsey (Kinsey et al. 1948), McIntosh makes a critical distinction between homosexual behavior and 'the homosexual role.' Homosexual behavior is widespread; but distinctive roles have developed only in some cultures, and do not necessarily encompass all forms of homosexual activity. The creation of a specialized, despised, and punished role or category of homosexual, such as that which developed in Britain from the early eighteenth century, was designed to keep the bulk of society pure in rather the same way that the similar treatment of some kinds of criminal keeps the rest of society law-abiding. McIntosh drew on a variety of intellectual sources, from structural functionalism to dramaturgical approaches, but clearly central to her argument was a form of labeling theory. The creation of the homosexual role was a form of social control designed to minoritize the experience, and protect and sustain social order and traditional sexual patterns. If McIntosh put on the agenda the process of social categorization, another related but distinctive approach that shaped social constructionism came from the work of Gagnon and Simon, summarized in their book Sexual Conduct: The Social Sources of Human Sexuality (1973). Drawing again on the work of Kinsey and symbolic interactionist traditions, they argued that, contrary to the teachings of sexology, sexuality, far from being the essence of 'the natural,' was subject to sociocultural shaping to an extraordinary degree.
The importance culture attributes to sexuality may, therefore, they speculated, not be a result of its intrinsic significance: society may have had a need to invent its importance and power at some point in history. Sexual activities of all kinds, they suggested, were not the results of an inherent drive but of complex psychosocial processes of development, and it is only because they are embedded in social scripts that the physical acts themselves become important. These insights suggested the possibility of exploring the complex processes by which individuals acquired subjective meanings in interaction with significant others, and the effects of 'sexual stigma' on these developmental processes (Plummer 1975). By the mid-1970s, it is possible to detect the clear emergence of a distinctive sociological account, with two related concerns. One focused on the social categorization of sexuality, asking questions about what historical factors shaped sexual differences which appeared as natural but were in fact cultural. The other was concerned primarily with methods of understanding the shaping of subjective meanings through sexual scripting, which allowed a better understanding of the balance between individual and collective sexual meanings. A third theoretical element now came into play: that represented by the work of Foucault (1976/1979). Foucault's essay is often seen, misleadingly, as the starting point of constructionist approaches, but there can be no doubt of the subsequent impact of what was planned as a brief prolegomenon to a multivolume study. Like Gagnon and Simon, Foucault appeared to be arguing that 'sexuality' was a 'historical invention.' Like McIntosh, and others who had been influenced by her, he saw the emergence of the concept of a distinctive homosexual personage as a historical process, with the late nineteenth century as the key moment. The process of medicalization, in particular, was seen as a vital explanatory factor. Like McIntosh, he suggested that psychologists and psychiatrists have not been objective scientists of desire, as the sexological tradition proclaimed, but on the contrary 'diagnostic agents in the process of social labeling' (McIntosh 1968). But at the same time, his suggestion that people do not react passively to social categorization—'where there is power, there is resistance'—left open the question of how individuals react to social definitions, how, in a word that now became central to the debate, identities are formed in the intersection of social and subjective meanings.
2. Sexual Behavior, Sexual Categories, and Sexual Identities
The most crucial distinction for social constructionism is between sexual behavior, categories, and identities. Kinsey et al. (1948) had shown that there was no necessary connection at all between what people did sexually and how they identified themselves.
If, in a much disputed figure, 37 percent of the male population had had some sort of sexual contact with other men to the point of orgasm, yet a much smaller percentage claimed to be exclusively homosexual, then identity had to be explained by something other than sexual proclivity or practice. Yet at the same time, by the 1970s, many self-proclaimed homosexuals were 'coming out,' in the wake of the new lesbian and gay movement. Many saw in the historicization of the homosexual category a way of explaining the stigma that homosexuality carried. What was made in history could be changed in history. Others, however, believed clearly that homosexuality was intrinsic to their sense of self and social identity, essential to their nature. This was at the heart of the so-called social constructionist–essentialist controversy in the 1970s and 1980s (Stein 1992). For many, a critique of essentialism could also be conceived of as an attack on the very idea of a homosexual identity, a fundamental challenge to the hard won gains of the lesbian and gay movement, and the claim to recognition of homosexuals as a legitimate minority group. This was the source of the appeal of subsequent theories of a 'gay gene' or 'gay brain,' which suggested that sexual orientation was wired into the human individual. It is important to make several clear points in response to these debates, where social scientific debates became a marker of social movement differences. First, the distinction between behavior, categories, and identities need not necessarily require the ignoring of questions of causation; it merely suspends them as irrelevant to the question of the social organization of sexuality. Foucault himself stated that: 'On this question I have absolutely nothing to say' (cited in Halperin 1995). The really important issue is not whether there is a biological or psychological propensity that distinguishes those who are sexually attracted to people of the same gender from those who are not. More fundamental are the meanings these propensities acquire, however or why ever they occur, the social categorizations that attempt to demarcate the boundaries of meanings, and their effect on collective attitudes and individual sense of self. Social categorizations have effects in the real world, whether or not they are direct reflections of inherent qualities and drives. The second point to be made is that the value of the argument about the relevance of theories of a 'homosexual role' does not depend ultimately on the validity of the variants of role theory (cf. Whitam and Mathy 1986, Stein 1992). The use of the word 'role' was seen by McIntosh (1968) as a form of shorthand, referring not only to a cultural conception or a set of ideas but also to a complex of institutional arrangements which depended on and reinforced these ideas. Its real importance as a concept is that it defined an issue that required exploration.
Terms such as constructionism and roles are in the end no more than heuristic devices to identify and understand a problem in studying sexuality in general and homosexuality in particular. It is transparently obvious that the forms of behavior, identity, institutional arrangements, regulation, beliefs, ideologies, even the various definitions of the 'sexual,' vary enormously through time and across cultures and subcultures. A major objective of historical and social constructionist studies of the erotic has been to problematize the taken for granted, to denaturalize sexuality in order to understand its human dimensions and the coils of power in which it is entwined, how it is shaped in and by historical forces and events. The historicization of the idea of the homosexual condition is an excellent pioneering example of this. The third point that requires underlining is that, regardless of evidence for the contingency of sexual identities, this should not imply that personal sexual identities, once acquired, can readily be sloughed off. The fact that categories and social identities are shaped in history does not in any way undermine the fact that they are fully lived as real. The complex relationship between societal categorization and the formation of subjectivities and sexual identities has in fact been the key focus of writing about homosexuality since the mid-1970s. On the one hand, there is a need to understand the classifying and categorizing processes which have shaped our concepts of homosexuality—the law, medicine, religion, patterns of stigmatization, formal and informal patterns of social regulation. On the other, it is necessary to understand the level of individual and collective reception of, and battle with, these classifications and categorizations. The best historical work has attempted to hold these two levels together, avoiding both sociological determinism (you are what society dictates) and extreme voluntarism (you can be anything you want): neither is true (see discussion in Vance 1989). Some of the most interesting work has attempted to explore the subcultures, networks, urban spaces, or even rural idylls that provided the space, the conditions of possibility, for the emergence of distinctive homosexual identities. McIntosh's suggestion that the late seventeenth century saw the emergence of a subcultural context for a distinctive homosexual role in England has been enormously influential. Her rediscovery of the London mollies' clubs has been the starting point of numerous historical excavations (e.g., Trumbach 1977, Bray 1982). There is now plentiful work which attempts to show that subcultures and identities existed before the late seventeenth century, for example, in the early Christian world (Boswell 1980), or in other parts of Europe (see essays in Herdt 1994), just as there have been scholars who have argued that we cannot really talk about homosexual identities until the late nineteenth, or even mid-twentieth, centuries (see essays in Plummer 1981). There is a real historical debate.
As a result, it would now seem remarkable to discuss sexual identities (and their complex relationship to social categorizations) without a sense of their historical and social context. Sexual identities are made in history, not in nature.
3. Heterosexuality and Homosexuality
Identities are not, however, created in isolation from other social phenomena. In particular, a history of homosexual identities cannot possibly be a history of a single homogeneous entity, because the very notion of homosexuality is dependent, at the very least, on the existence of a concept of heterosexuality, which in turn presupposes a binary notion of gender. Only if there is a sharply demarcated difference between men and women does it become meaningful to distinguish same sex from other sex relationships. Social constructionism, Vance (1989) noted, had paid little attention to the construction of heterosexuality. But without a wider sense that 'the heterosexual' was also a social construction, attempts to explain the invention of 'the homosexual' made little sense. One of the early attractions of the first volume of Foucault's The History of Sexuality was precisely that it both offered an account of the birth of the modern homosexual, and put that into a broader historical framework: by postulating the invention of sexuality as a category in western thought, and in delineating the shifting relationships between men and women, adults and children, the normal and the perverse, as constituent elements in this process. Foucault himself was criticized for putting insufficient emphasis on the gendered nature of this process, but this was more than compensated for by the developing feminist critique of 'the heterosexual institution,' with its own complex history (MacKinnon 1987, Richardson 1996). Central to these debates was the perception that sexuality in general is not a domain of easy pluralism, where homosexuality and heterosexuality sit easily side by side. It is structured in dominance, with heterosexuality privileged, and that privilege is essentially male oriented. Homosexuality is constructed as a subordinate formation within the 'heterosexual continuum,' with male and female homosexuality having a different relationship to the dominant forms. In turn, once this is recognized, it becomes both possible and necessary to explore the socially constructed patterns of femininity and masculinity (Connell 1995). Although the constructionist debates began within the disciplines of sociology and history, later developments, taking forward both theoretical and political (especially feminist) interventions, owed a great deal to poststructuralist and deconstructionist literary studies, and to the emergence of 'queer studies.' Whereas history and sociology characteristically have attempted to produce order and pattern out of the chaos of events, the main feature of these approaches is to show the binary conflicts reflected in literary texts.
The texts are read as sites of gender and sexual contestation and, therefore, of power and resistance (Sedgwick 1990). Sedgwick's work, and that of the American philosopher Butler (1990), were in part attempts to move away from the essentialist/constructionist binaries by emphasizing the 'performative' nature of sex and gender. This in turn opened up what might be called the 'queer moment' that radically challenged the relevance of fixed sexual categorizations. For queer theorists, the perverse is the worm at the center of the normal, giving rise to sexual and cultural dissidence and a transgressive ethic, which constantly works to unsettle binarism and to suggest alternatives.
4. Comparative Perspectives
Much of the debate about the homosexual/heterosexual binary divide was based on the perceived Western experience, and was located in some sense of a historical development. Yet from the beginning, comparisons with non-Western sexual patterns were central to constructionist perspectives. Foucault (1976/1979) compared the Western 'science of sex' with the non-Western 'erotic arts.' It was the very fact of different patterns of 'institutionalised homosexuality' that formed the starting point of McIntosh's essay, where she had identified two key regulating elements: between adults and nonadults (as in the intergenerational sex which denoted the passage from childhood to adulthood in some tribal and premodern societies), and between the genders (as in the case of the native American 'berdache'). For some writers, these patterns are the keys to understanding homosexuality in premodern times. Historians have traced the evolution over time of patterns of homosexual life which have shifted from intergenerational ordering, through categorization around gender and class, to recognizably modern forms of egalitarian relationships (for a historical perspective see Trumbach 1998). So it is not surprising that constructionist approaches have led to an efflorescence of studies of sexuality in general, and homosexuality in particular, in other cultures, tribal, Islamic, southern (Herdt 1994). This comparative framework increasingly has been deployed within contemporary western societies to highlight the difficulty of subsuming behavior within a confining definition of condition or fixed orientation.
5. Beyond Constructionism
Historical and social constructionism has advanced and changed rapidly since the 1970s. The 'category' that early scholars were anxious to deconstruct has become 'categories' which proliferate in contemporary societies.
'Roles,' neat slots into which people could be expected to fit as a response to the bidding of the agents of social control, have become 'performances' (Butler 1990) or 'necessary fictions' (Weeks 1995), whose contingencies demand exploration. 'Identities,' which once seemed categoric, are now seen as fluid, relational, hybrid: people are not quite today what they were yesterday, or will be tomorrow. Identities have come to be seen as built around personal 'narratives,' stories people tell each other in the various interpretive communities to which they belong (Plummer 1995). Individual identities, it is increasingly recognized, are negotiated in the ever-changing relationship between self and other, within rapidly changing social structures and meanings. Sexual orientation may, or may not, be a product of genetics, psychosocial structuring, or environmental pressures. That issue, which has tortured sexology for over a century, may or may not be resolved at some stage in the future. For the constructionist, however, other questions are central: not what causes the variety of sexual desires, 'preferences,' or orientations that have existed in various societies at different times, but how societies shape meanings around sexual diversity, and the effects these have on individual lives. See also: Feminist Theory: Radical Lesbian; Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Gender Differences in Personality and Social Behavior; Gender Ideology: Cross-cultural Aspects; Heterosexism and Homophobia; Masculinities and Femininities; Queer Theory; Sex-role Development and Education; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Orientation and the Law; Sexual Orientation: Biological Influences; Sexual Preference: Genetic Aspects; Sexuality and Gender; Sexuality and Geography
Bibliography
Altman D, Vance C, Vicinus M, Weeks J et al. 1989 Homosexuality, Which Homosexuality? GMP Publishers, London
Boswell J 1980 Christianity, Social Tolerance, and Homosexuality: Gay People in Western Europe from the Beginning of the Christian Era to the Fourteenth Century. University of Chicago Press, Chicago
Bray A 1982 Homosexuality in Renaissance England. GMP Publishers, London
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Connell R W 1995 Masculinities. University of California Press, Berkeley, CA
Foucault M 1976/1979 Histoire de la sexualité, 1, La Volonté de savoir. Gallimard, Paris [trans. 1979 as The History of Sexuality, Vol. 1, An Introduction. Allen Lane, London]
Gagnon J H, Simon W 1973 Sexual Conduct: The Social Sources of Human Sexuality. Aldine, Chicago
Greenberg D E 1988 The Construction of Homosexuality. University of Chicago Press, Chicago
Halperin D 1995 Saint Foucault: Towards a Gay Hagiography. Oxford University Press, New York
Herdt G (ed.) 1994 Third Sex, Third Gender: Beyond Sexual Dimorphism in Culture and History. Zone Books, New York
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. Saunders, Philadelphia
McIntosh M 1968 The homosexual role. Social Problems 16(2): 182–92
MacKinnon C A 1987 Feminism Unmodified: Discourses on Life and Law. Harvard University Press, Cambridge, MA
Plummer K 1975 Sexual Stigma: An Interactionist Account. Routledge and Kegan Paul, London
Plummer K 1995 Telling Sexual Stories: Power, Change and Social Worlds. Routledge, London
Richardson D (ed.) 1996 Theorising Heterosexuality: Telling it Straight. Open University Press, Buckingham, UK
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Stein E (ed.) 1992 Forms of Desire: Sexual Orientation and the Social Constructionist Controversy. Routledge, New York
Trumbach R 1977 London's sodomites: Homosexual behavior and Western culture in the 18th century. Journal of Social History 11(1): 1–33
Trumbach R 1998 Sex and the Gender Revolution: Heterosexuality and the Third Gender in Enlightenment London. Chicago University Press, Chicago, Vol. 1
Vance C S 1989 Social construction theory: Problems in the history of sexuality. In: Altman D, Vance C, Vicinus M, Weeks J et al. (eds.) Homosexuality, Which Homosexuality? GMP Publishers, London, pp. 13–34
Weeks J 1977 Coming Out: Homosexual Politics in Britain from the Nineteenth Century to the Present. Quartet Books, London
Weeks J 1995 Invented Moralities: Sexual Values in an Age of Uncertainty. Columbia University Press, New York
Whitam F L, Mathy R M 1986 Male Homosexuality in Four Societies. Praeger, New York
J. Weeks
Sexual Perversions (Paraphilias)
1. Definition
Modern usage tends to favor the word 'paraphilia' rather than 'deviance' or 'perversion.' In this article these latter terms will be used only in their historical context. The word 'paraphilia' means a love of (philia) the beyond or irregular (para), and is used instead of those words which today have pejorative implications that are not always relevant. The term itself is used to describe people, usually men, with intense sexual urges that are directed towards nonhuman objects, or the suffering or humiliation of oneself or one's partner, or more unacceptably, towards others who are incapable of giving informed consent, such as children, animals, or unwilling adults. People who are paraphiliacs often exhibit three or four different aspects, and clinical psychiatric conditions (personality disorders or depression) may sometimes be present. Paraphilias include: exhibitionism (the exposure of one's genitals to a stranger, sometimes culminating in masturbation);
voyeurism or peeping (the observance of strangers undressing or having sexual intercourse, without their being aware of the voyeur, who usually masturbates); fetishism (the use of inanimate objects for arousal, usually articles of women's underwear, although if these are used to cross-dress, transvestic fetishism is the diagnosis); frotteurism (the rubbing of the genitals against the buttocks, or the fondling of an unsuspecting woman, usually in a crowded situation, so that detection of the perpetrator is unlikely). Pedophilia refers to men sexually attracted to children, some to girls, some to boys, and some to either sex. Some pedophiles are attracted sexually only to children, the exclusive types, but some are attracted to adults as well, the nonexclusive type. Sexual sadism and sexual masochism (S&M) involve, respectively, the infliction or reception of pain as a necessary condition for arousal and orgasm. Sadistic fantasies that involve obtaining complete control of the victim are particularly dangerous and may result in death. Some masochistic behaviors that involve self-asphyxiation as part of a masturbatory ritual can result in accidental death. Other paraphilias include the use for sexual purposes of corpses (necrophilia), animals (zoophilia), feces (coprophilia), enemas (klismaphilia), and urine (urophilia). It is of particular interest that just as masturbation is nowadays excluded from being considered a deviant act, so homosexuality, which figured so largely in this area in earlier times, was removed as a paraphilia in the 1974 edition of the Diagnostic and Statistical Manual of Mental Disorders.
2. Greek Mythology
The ancient Greeks often ascribed to their gods extreme forms of sexual behavior. Whether this reflected fears or wishful thinking or actual practices is obviously very difficult to evaluate. Zeus's myriad affairs were often conducted in the form of animals or even inanimate objects, such as a cloud or a shower of gold, and his objects of desire were male or female, as indeed were those of the other gods and heroes. Artemis and Athena both eschewed sex altogether and remained adamantly virginal. The mighty Hercules cross-dressed to be humiliated by Queen Omphale, while Theseus and Achilles both donned women's clothes without apparent later loss of esteem (Licht 1969, Bullough 1976).
3. History
3.1 The Hunter-gatherer Past
In most species sexuality and reproduction are of necessity tied together and overlap, but in modern humans it is the relatively recent separation of sexuality from reproduction that represents a crucial evolutionary phase.
Despite huge sociological changes we are genetically still at the stage of hunter-gatherers, who had a shorter life, no menopause, and long periods between childbirth due to prolonged breast feeding. Perhaps because of better nutrition, and possibly exposure to electric light (which affects the hypothalamus), girls in the developed countries in the 1990s have their first period at around 13 years of age, and ovulation starts soon afterwards, with pregnancy being a distinct possibility. Usually most women have conceived and had their children by 30–40 years of age, which, assuming many will reach 80 years of age, leaves some 40–50 years of life to enjoy nonreproductive sex. Sex therefore takes on new meanings and fulfils other roles, perhaps as a recreational activity, which allows time for sexual variant behavior (Short 1976).
3.2 Sexuality
Numerous so-called Venus figurines and cave paintings have been found from between 25,000 and 10,000 years ago that depict almost all of the sexual acts with which we are familiar today. What the meanings and significance of these depictions were to the people who created them we can only speculate. Early studies showed that paraphilias were present universally in every culture and throughout every historical period. Often tied up with religious rites, sacred prostitution, and/or phallic cults, public shows of self-immolation, sadism, fetishism, and homosexuality flourished. Iwan Bloch believed that every sensory organ could function as an erotogenous zone, and so form the basis for a so-called perversion. Freud pointed out that the ancients laid stress on the instinct itself, whereas we today are preoccupied with its object. Thus it was considered wrong to be passive, that is, to be penetrated anally by one's slave, but acceptable to penetrate him or her. Today the object, that is, whom a man penetrates, is important.
3.3 Classifications
The enormous variations found in human sexual behavior have always been part of the heritage of humankind, but it was not until the eighteenth century in the West that such activities were labeled and put into categories, so that attempts at a scientific classification could be made. These categories varied from culture to culture and showed a great deal of fluidity, as each culture constructed the behavior in different ways. The essential behaviors remained constant, however; it was rather how each society considered them, whether it approved or disapproved, that created the particular sexual climate. Classifications thus offer an insight into the current thinking of each particular time. The Marquis de Sade (1740–1814), while imprisoned in the Bastille, wrote a full description of most perversions and suggested a classification about
100 years before his successors, Krafft-Ebing, Havelock Ellis, and Freud. He wrote many short stories on sexual themes, including incest and homosexuality. He also foresaw the women's rights issue, which he treated in a compassionate and modern way, yet this did not preclude his use of women as victims in his writings. His 120 Days of Sodom is perhaps his main work on perversion, although all his works contain little homilies about sex and morals. As de Sade lived in an age that was both turbulent and brutal, it says something for the humanity of the man that he implored us not to laugh at or sneer at those with deviant impulses, but rather to look upon them as one might a cripple, with pity and understanding. De Sade offers the following classification for sexuality (Gorer 1963):
(a) people with weak or repressed sexual desires—the nondescript majority, whom he saw as cannon fodder for the next two groups;
(b) natural perverts (born so);
(c) libertines—people who imitate group (b), but wilfully rather than innately.
De Sade's admiration was for group (b). The 120 Days of Sodom, written while de Sade was in the Bastille, contains examples of all these, and the range extends from foot fetishism, various types of voyeurism, obsessional rituals, and bestiality to frank sadism and murder. De Sade described many sexual acts involving body fluids: sperm, blood, urine, saliva, and feces. Jeremy Bentham (1748–1832), the utilitarian philosopher who believed that nature has placed humankind under the governance of two sovereign masters, pain and pleasure, wrote his essay on pederasty around 1785 (it was not published until 1978), arguing cogently for the decriminalization of sodomy (then punishable by hanging) and other sexual acts. He challenged the notion current at the time that such behaviors weaken men, threaten marriage, and diminish the population. He characterized the existing punishments as expressions of an irrational antipathy to pleasure, and he highlighted the dangers that prohibitions incur, i.e., possible blackmail and false accusations (Crompton 1978). In the nineteenth century, Darwin's theory of evolution explained the gill slits and tail present in the early human embryo, as well as the appendix, as evidence of our evolutionary past. Ernst Haeckel, Darwin's forceful adherent, developed the concept that ontogeny recapitulates phylogeny (see Evolution, Natural and Social: Philosophical Aspects). It was argued from this that if elements of our past physical evolution were so preserved, why should the same not apply to our psychological history? Thus sexual perversions were deemed to reflect a return to an earlier developmental stage in phylogeny that had somehow become fixated, so that perverts—like, on this view, other races and women—stood at a lower point on the evolutionary ladder. Following on from Darwin's observations of hermaphroditism in nature, Karl Heinrich Ulrichs (whose pseudonym was Numa Numantius)
believed male homosexuals to be female souls in men's bodies. Succeeding generations of doctors and sexologists argued these points, and many offered alternative classifications of the perversions to reflect their views. These are mainly of historical value now, but Krafft-Ebing's classification may serve as an example. He believed that life was a struggle between instinct and civilization, and that mental illness created no new instincts but only altered those that already existed. He believed that hereditary taint and excessive masturbation were among the causes of sexual perversion. His classification, which is self-explanatory, was as follows:
Group 1: Deviation of the sex impulse—too little sexual feeling (impotence, frigidity); too much (satyriasis and nymphomania); and sexual feeling appearing at the wrong time (in childhood or old age) or directed wrongly (at children, the elderly, or animals).
Group 2: Sexual release through other forms of activity (inflicting or receiving pain, or sexual bondage).
Group 3: Inverted sexuality (homosexuality, bisexuality, and transvestism).
Krafft-Ebing's Psychopathia Sexualis ran into 12 editions in his lifetime, with many revisions, and although he did write some of the more explicit details in Latin, the British Medical Journal in 1893 lamented that the entire book had not been written in Latin, 'in the decent obscurity of a dead language.' Like de Sade before him, Krafft-Ebing spoke of 'perverts' as 'stepchildren of nature,' and asked for a more medical and sympathetic approach rather than mere legal strictures. His belief that most perversions were mainly congenital was challenged by Alfred Binet, who argued that with fetishism (a term he coined) the fetish may take many different forms—an obsession with the color of the eyes, bodily odors, various types of dress, or different inanimate objects. Heredity could not dictate this choice; a chance event in childhood was a more likely cause. Furthermore, on this argument, the fetishist could just as well have become a sadist, masochist, or homosexual if early childhood events had been so conducive. As many individuals were exposed to such childhood events and did not develop a perversion, however, Binet had to conclude that there could well be a congenital morbid state in those who did. Albert Moll argued further that, since all biological organs and functions are subject to variations and anomalies, sexual behaviors should be no different. Moll also believed that the sense of smell was always an important factor in mammalian sexuality, and that the advent of clothing in human culture diminished this. Wilhelm Fliess, an ENT surgeon and colleague of Freud, drew attention to the erectile
tissues of the nose and their similarity to those of the penis and clitoris. The upright posture adopted by humans had distanced them from the ground, away from feces and smell, so diminishing the importance of olfaction in human arousal and forming what Freud described as an abandoned erotogenous zone (Sulloway 1992). Today these mechanisms have been somewhat elaborated. A region in the nasal septum (Jacobson's organ) is linked by nerves to the hypothalamus, a region of the brain that controls sex-hormone secretion. Small volatile chemicals known as pheromones, found in sweat around the genital and axillary regions, are known to stimulate this nasal pathway. Subtle differences in the pheromone balance depend on the individual's genetic make-up, and it is possible that these differences provide a biological basis for the avoidance of attraction to one's near relatives, who exhibit similar profiles. The importance of smell in animal sexuality is well known, and it is of interest that many fetishists are aroused by soiled rather than clean female underwear (Goodman 1998). Freud believed that each individual from infancy to adulthood repeated the moral development of the race, from sexual promiscuity, to perversion, and then on to heterosexual monogamy. There was a correct developmental path, during which the infant, from the stage of polymorphous perversity, went through various phases of development and negotiated the Oedipal situation and castration complex. Freud believed that perversions arose because of arrested development leading to fixations on this pathway due to sexual trauma. Those who did not deviate when so exposed in childhood, Freud believed to have been protected by constitutional factors, namely an innate sense of propriety acquired through moral evolution. He classified as deviants those who have different sexual aims (from normal vaginal intercourse), for example sadists and fetishists, and those who have a different love object (from the normal heterosexual partner), such as pedophiles or homosexuals (Weeks 1985) (see Psychoanalysis: Current Status).
4. Modern Times
At the beginning of the twenty-first century, technology-led cultures have had a profound effect on sexual behavior, which has expanded to fill the different niches in the manner of Darwin's 'tangled bank.' We may see even greater fluidity of such behaviors in the future. Just two areas will be briefly considered.
4.1 Fetishism and Fashion
Today the fashion industry makes much use of fetishism in its designs. Magazines cater for a whole range of people, both gay and straight, with fetishistic and S&M interests. Relatively new phenomena have
appeared in recent times, such as 'vogueing,' where the devotee dresses up as a facsimile of an admired personality, either of the same sex (homovestism) or of the opposite sex (the well-known transvestism). Common examples are Elvis Presley and Madonna lookalikes. Voguers compete with each other for realness. Fans who obtain objects such as articles of clothing or signed photographs from their idols, or who in the USA search the dustbins of stars for trinkets (so-called 'trashcanners'), and later use these in their sexual fantasies, resemble classical fetishists in their behavior (Gammon and Makinen 1994, Goodman 1998).
4.2 Technology and Sexuality
Every technological invention has altered society in some way, and this applies equally to the field of sexual behavior. Newspapers allow the wide advertisement of sexual services; the telephone made obscene telephone calls possible, and commercial sex lines have utilized this phenomenon so that men (it is mainly men who use them) can dial in and indulge in sexual talk for the price of the call. The automobile is often used as a place of sexual assignation, the home video for viewing obscene material, and the camcorder for making erotic home movies. The term 'cybersex' describes the pursuit of erotic fantasies through the Internet. The Internet provides a means of getting in touch with other like-minded individuals worldwide, so that fantasies which may previously have existed only in the mind of one individual can be exchanged, enhanced, and embellished. Arrangements have been made for such people to meet in order to carry out acts which have included pedophilia, rape, and even murder. The perpetrator(s) may take on a different age or gender persona ('gender bending') to entrap the unwary, especially children. As a result, groups have been set up to combat these problems, for example the Cyberangels in the UK, who monitor the net and inform the police where necessary. Psychologically, some individuals in tedious relationships may find that occasional looks at erotica enliven their sex lives, but others have become obsessed with and addicted to on-line sex ('hot chatting'), and indeed have needed therapy to help them cope (Durkin and Bryant 1995). Virtual reality, which offers the possibility of taking part in a virtual sexual scenario, will further increase the scope of sexual variant behavior.
5. Recent Scientific Advances in the Understanding of the Paraphilias
5.1 The Brain
The human brain reached its present size and proportions about 50,000–100,000 years ago. The brain
consists of regions which formed at different epochs in vertebrate evolution. It is the hypothalamus that is largely concerned with the hormonal control of reproduction. Certain nuclei found here in homosexual men have been claimed to be more female- than male-like, and similar findings have been made in male-to-female transsexuals (Swaab et al. 1997). Sexual fantasy depends on the cerebral cortex, which is of such complexity that an almost infinite variety of mental responses is possible, ensuring a unique plasticity in the range of sexual behaviors. Furthermore, following head injury or the use of certain drugs, paraphilic behavior may become evident in individuals who did not show such tendencies before. Medical conditions such as temporal lobe epilepsy have been associated with fetishism in some patients. New noninvasive techniques for studying the brain, such as functional magnetic resonance imaging and positron emission tomography, are helping to elucidate its functions.
5.2 Genes and the Developing Fetus
The contributions of genes and development to adult sexual behavior have been a topic of intense debate for many years. Recently, new discoveries have added fuel to this argument, and new concepts have been considered. In humans the Y-chromosome slows down the growth rate of the male fetus, which is consequently born relatively more immature than the female and is therefore more vulnerable. Thus males have a higher perinatal mortality and a higher incidence of accidental death, and later in life they show an increased vulnerability to cardiac disease and certain forms of cancer. Mental handicap, which includes autism and epilepsy, is more common in men than in women, and perhaps the preponderance of the paraphilias in males is also a reflection of this vulnerability (Ounsted and Taylor 1972). Do genes play some part in determining human sexual behaviors and orientation? Certainly in Drosophila, the fruit fly, where male and female behavior were once considered separate and exclusive, recent experimental manipulations of various genes have produced bisexual and homosexual behaviors in males. Even courtship chains, both heterosexual and homosexual (like something out of the Marquis de Sade), have been seen—behaviors that never occur in the wild (Yamamoto et al. 1996). In humans, some paraphiliac behavior does seem to run in families (Gorman 1964, Cryan et al. 1992), and suggestions have been made of a gene on the X-chromosome that predisposes to male homosexuality (Hamer and Copeland 1994). Developmental factors also seem important for future behavior, and it is the presence of the two hormones, testosterone and estrogen, that
are thought to influence the developing fetus. This has been shown in mice and other species. When a male fetus lies between two female fetuses, leakage of female hormones may feminize the male; testosterone can masculinize a female lying between two males (Vom Saal and Bronson 1980). Stress applied to pregnant rats just one week before delivery resulted in homosexual and bisexual behaviors in males (Ward and Weisz 1980), and Dorner et al. (1983) thought that the stress of war in pregnant women resulted in a higher incidence of male homosexuality in the German population. Sex hormones, particularly testosterone, are known to interact with the immune system. It has been suggested that the mother's immune response, triggered by a male fetus, can affect male psychosexual development, especially if she has had many male pregnancies, each of which would further sensitize her immune system; this may explain the preponderance of elder brothers in the families of homosexuals and certain pedophiles (Blanchard and Bogaert 1996) (see Homosexuality and Psychiatry).
6. The Future
Freud and the sexologists conceived of the libido in terms of the combustion engine and electricity, as well as using biological ideas that were extant at the time. Today, as knowledge in all fields converges, concepts from one area are fertilized as never before by ideas from others. Cybernetics, the study of control and communication in machines and living systems, has been applied to biological systems such as the brain. Chaos theory, a concept derived from nonlinear dynamics and used initially to predict the weather, has been applied to numerous other areas of research, including fetal development and sexual behavior, as well as various branches of psychology (Goodman 1997). Waddington (1975) portrayed fetal development as a mountainous terrain in which the fetus is a ball rolling down the valleys, which depict the possible developmental pathways. These pathways represent the culmination of millions of years of evolution and are relatively resistant to change. Both genetic factors and early environmental stresses may nevertheless divert the fetus onto another pathway, although the system does have a degree of stability. It may well be that certain paraphiliacs and individuals with homosexual or transsexual identities have been diverted from the more common (but not necessarily more normal) path of heterosexual identity. If sexual orientation and behavior are linked to cognition, as seems to be true, then variations could have evolutionary possibilities by throwing up individuals who have the ability to think differently from their peers, offering no little advantage in the struggle for survival. The epistemological solipsism of the developing brain needs sex for its development in the world and not just for
procreation (Freeman 1995). Sexual behavior and its variants may therefore merely be a reflection of this process. We should contemplate it with a sense of awe.
See also: Rape and Sexual Coercion
Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders: DSM-IV, 4th edn. American Psychiatric Association, Washington, DC
Blanchard R, Bogaert A F 1996 Homosexuality in men and number of older brothers. American Journal of Psychiatry 153(1): 27–31
Bullough V L 1976 Sexual Variance in Society and History. Wiley, New York
Crompton L 1978 Offences against oneself, by Jeremy Bentham, Part 1 (ca. 1785). Journal of Homosexuality 3(4): 389–405
Cryan E M J, Butcher G J, Webb M G T 1992 Obsessive-compulsive disorder and paraphilia in a monozygotic twin pair. British Journal of Psychiatry 161: 694–8
Dorner G, Schenk B, Schmiedel B, Ahrens L 1983 Stressful events in prenatal life of bi- and homosexual men. Experimental Clinical Endocrinology 81: 83–7
Durkin K F, Bryant C D 1995 'Log on to sex': Some notes on the carnal computer and erotic cyberspace as an emerging research frontier. Deviant Behavior: An Interdisciplinary Journal 16: 179–200
Freeman W J 1995 Societies of Brains: A Study in the Neuroscience of Love and Hate. Lawrence Erlbaum, Hillsdale, NJ
Gammon L, Makinen M 1994 Female Fetishism: A New Look. Lawrence & Wishart, London
Goodman R E 1997 Understanding human sexuality: Specifically homosexuality and the paraphilias in terms of chaos theory and fetal development. Medical Hypotheses 48(3): 237–43
Goodman R E 1998 The paraphilias: An evolutionary and developmental perspective. In: Freeman H, Pullen I, Stein G, Wilkinson G (eds.) Seminars in Psychosexual Disorders. Gaskell, London, pp. 142–55
Gorer G 1963 The Life and Ideas of the Marquis de Sade. Peter Owen, London
Gorman G F 1964 Fetishism occurring in identical twins. British Journal of Psychiatry 110: 255–6
Hamer D, Copeland P 1994 The Science of Desire: The Search for the Gay Gene and the Biology of Behavior. Simon and Schuster, New York
Licht H 1969 Sexual Life in Ancient Greece. Panther Books, London
Ounsted C, Taylor D C 1972 The Y-chromosome message, a point of view. In: Ounsted C, Taylor D C (eds.) Gender Differences: Their Ontogeny and Significance. Churchill Livingstone, Edinburgh, UK
Short R V 1976 The evolution of human reproduction. Proceedings of the Royal Society of London, Biology 195: 3–24
Sulloway F J 1992 Freud, Biologist of the Mind. Harvard University Press, Cambridge, MA
Swaab D F, Zhou J N, Fodor M, Hofman M A 1997 Sexual differentiation of the human hypothalamus: Differences according to sex, sexual orientation and transsexuality. In: Ellis L, Ebertz L (eds.) Sexual Orientation: Toward Biological Understanding. Praeger, Westport, CT, pp. 129–50
Vom Saal F S, Bronson F H 1980 Sexual characteristics of adult female mice are correlated with their blood testosterone levels during prenatal development. Science 208: 597–9
Waddington C H 1975 The Evolution of an Evolutionist. Edinburgh University Press, Edinburgh, UK
Ward I L, Weisz J 1980 Maternal stress alters plasma testosterone in foetal males. Science 207: 328–9
Weeks J 1985 Sexuality and its Discontents. Routledge and Kegan Paul, London
Yamamoto D, Ito H, Fujitani K 1996 Genetic dissection of sexual orientation: Behavioral, cellular, and molecular approaches in Drosophila melanogaster. Neuroscience Research 26: 95–107
R. E. Goodman
Sexual Preference: Genetic Aspects
In many animals, females preferentially mate with males that are adorned with extravagant traits such as bright feathers or complicated courtship behavior. As a result, such sexual preferences have led to the evolution of many elaborate male signals, for example nightingale song or peacock feathers. Many extravagant male traits are thus clearly caused by female sexual preferences, but why did female preference evolve in the first place? Several different hypotheses have been proposed to explain the evolution of female choice. Here, these hypotheses and the available empirical evidence that might help to distinguish between them are discussed.
1. Sexual Selection and the Sex Roles
Males and females in most animal species differ to a large extent in their morphology and their behavior. Female birds, for example, are often drab and coy compared to their colorful and sexually active mates. In many species, males compete with other males for access to females and often use specialized horns, teeth, or other weapons, whereas their mates often do not fight with other females. The basic reason behind these sex differences is gamete size, which often differs by several orders of magnitude between the sexes. Females produce relatively large gametes, the eggs, and accordingly can produce only relatively few of them. Males, on the other hand, produce large numbers of tiny sperm cells. Due to this difference in the number of gametes available, female reproductive success usually is limited by the number of eggs produced, and multiple matings by females have at most a small effect on female reproductive success. Male reproductive success, however, is mainly limited by the number of eggs fertilized, and thus by the number of mates obtained, and not so much by sperm
number (Clutton-Brock and Vincent 1991). Males therefore can benefit from attractiveness and from competition for mating partners, whereas females probably benefit more by choosing the best among the many available mates. In accordance with this expectation, females often seem to examine the available males carefully and reject mating attempts by nonpreferred males (Andersson 1994). In peacocks, for example, hens prefer males with large, colorful, symmetrically distributed feathers, and in some other birds females prefer males with large song repertoires. The few species where males invest heavily in offspring—e.g., some katydids with large parental investment, as well as seahorses and wading birds with exclusive paternal care—confirm the rule. In these species females usually compete for mates and males choose among females, because males are limited by the number of offspring they can care for and females are limited by the number of males they can obtain to care for their offspring. These studies on sex-role-reversed species show that the sex difference in the benefit of matings beyond the first is the driving force behind sexual selection. With standard sex roles, male traits that impair survival but render males especially attractive to females can evolve, since male traits that are preferred by females will cause an increased mating frequency. Due to the effect of mating frequency on male reproductive success, these traits can lead to increased lifetime mating success even when male survival decreases. The splendid feathers of male peacocks, for example, probably decrease male survival because this shiny ornament will also attract predators and make escape less efficient. Less adorned males may live longer; however, when they leave on average fewer offspring than attractive males, genes for being less adorned will go extinct. Compared to the evolution of male traits under the influence of female choice, the evolution of female preference is less easy to explain. In the next section the possible routes for the evolution of female preference are described.
2. Theoretical Models for the Evolution of Preferences
Several hypotheses have been presented to explain the evolution of the female preferences that cause females to choose their mating partners among the available males (see Kirkpatrick and Ryan 1991, Maynard Smith 1991, for reviews). The simplest possibility would be that males differ in the amount of resources provided or in their fertilization rate. According to this hypothesis, choosy females increase the number of offspring produced and thus benefit directly from their choice. In some birds, for example, those males are preferred as mates that provide superior territories and outstanding paternal care. Another possible explanation for the evolution of female preferences is
that choice has an influence on the genetic quality of the offspring. According to this hypothesis, females benefit indirectly when offspring of preferred males have superior viability or increased mating success compared to the offspring of nonpreferred males. And finally, female preference might have evolved in a context other than sexual selection: female birds might prefer males with blue feathers because it is adaptive to react positively to blue where blue berries constitute a valuable resource, so that female sensory physiology has become tuned to this color. This model has accordingly been termed the sensory exploitation hypothesis, meaning that males exploit female sensory physiology to attain attractiveness (Ryan and Keddy-Hector 1992). For all of these models, genetic variation in preference is essential, since the evolution of any trait rests on the existence of genetic variance. Despite the importance of such data, not very many species have been examined for genetic variance in female preference. In most of the examined species, ranging from insects to mice, within-population variation in female preference seems generally to be influenced by additive genetic variance (Bakker and Pomiankowski 1995). Such genetic variance in preference means that this trait can easily evolve whenever there is a selective advantage to a specific preference. In the following paragraphs the different hypotheses for the evolution of female preferences are discussed in more detail, concentrating on their genetic aspects. When females benefit directly from their choice through increased lifetime fecundity, it is easy to see that any new choice will evolve as long as the extra benefit gained is larger than the extra cost paid for female choice. For example, females might prefer healthy mates because this might reduce the risk of infection during mating. Choosing males with intact and shiny feathers might thus reduce the risk of ectoparasite transfer during copulation. For models that rest on indirect benefits of female preference, females gain from choice only with regard to the genetic quality of their offspring. For these indirect benefit models of female preference, the male trait also needs to have genetic variance. One of these hypotheses, Fisher's arbitrary sexual selection model, predicts that female choice should evolve in response to the evolution of male traits. If females prefer males with a specific trait size, a linkage disequilibrium (a nonrandom association of genes) will build up, since the genes for the preference and the genes for the male trait will co-occur more frequently than by chance. The existence of choosy females will, in addition, cause increased reproductive success of the preferred males. Since these males also carry a disproportionate share of the preference allele, due to the linkage disequilibrium caused by the preference, the preference evolves in response to the evolution of the male trait (Kirkpatrick and Ryan 1991). If sexual selection is strong and if preference and male trait are heritable, both the
male trait and the female preference are predicted to evolve by a positive feedback that eventually might lead to runaway selection. The distinctive and sufficient genetic condition for Fisher's arbitrary sexual selection model is thus a genetic correlation between female preference and male traits. According to the good genes hypothesis, another hypothesis that suggests indirect benefits of female preferences, choosy females benefit not only by producing preferred sons but also by producing offspring with superior viability. For such a process to work, male attractiveness needs to indicate male viability. In genetic terminology, there has to be a genetic correlation between male signaling traits and viability. Since a genetic correlation between female preference and male signaling traits will also build up under the good genes model, such a correlation cannot be taken as evidence for the arbitrary sexual selection model; it shows only that one of the indirect benefit models helps to maintain the preference. To provide evidence that the sensory exploitation model helps to explain the evolution of female preference, one has to show that the female preference evolved before the male trait and thus independently of mate choice. The usual way is to show that the female preference is more ancestral in phylogeny than the male trait. One of the best-studied examples of this hypothesis is the tungara frog, Physalaemus pustulosus, where females have a preference for a male acoustic signal that is not produced by males of their own species, although males of a closely related species do produce it. The suggested explanation for this pattern is that female choice evolved first in the ancestor of both species and the male trait evolved later in only one of them. However, the loss of the attractive male signal during phylogeny cannot be excluded as an alternative explanation.
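The buildup of linkage disequilibrium at the heart of Fisher's model can be illustrated with a deliberately minimal toy simulation (a sketch for illustration only; the haploid two-locus setup and all parameter values are invented assumptions, not a model taken from the literature cited in this article). Sex is ignored: each individual can act as mother or father, the preference allele P is expressed only when acting as mother, and the trait allele T only when acting as father.

import random

N = 4000          # individuals per generation (arbitrary)
ACCEPT = 1 / 3.0  # chance a choosy mother accepts a trait-less male (arbitrary)

random.seed(1)
# a haplotype is (pref, trait), each allele 0 or 1; start with no association
pop = [(int(random.random() < 0.3), int(random.random() < 0.3))
       for _ in range(N)]

def stats(pop):
    # allele frequencies and linkage disequilibrium D = f(PT) - f(P) * f(T)
    n = float(len(pop))
    p = sum(g[0] for g in pop) / n
    t = sum(g[1] for g in pop) / n
    pt = sum(1 for g in pop if g[0] and g[1]) / n
    return p, t, pt - p * t

for gen in range(41):
    if gen % 10 == 0:
        p, t, d = stats(pop)
        print("gen %2d  f(P)=%.3f  f(T)=%.3f  D=%+.4f" % (gen, p, t, d))
    offspring = []
    while len(offspring) < N:
        mother, father = random.choice(pop), random.choice(pop)
        # a choosy mother (P = 1) usually rejects males lacking the trait
        if mother[0] == 1 and father[1] == 0 and random.random() > ACCEPT:
            continue
        # free recombination: each locus is inherited from a random parent
        offspring.append((random.choice((mother[0], father[0])),
                          random.choice((mother[1], father[1]))))
    pop = offspring

Even in this stripped-down setting, the mating rule alone typically drives D above zero and raises the frequency of the trait allele, so the preference allele 'hitchhikes' along with it—the statistical association described above, though not a demonstration of full runaway dynamics, which would additionally require heritable variation in preference strength and trait size.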
3. Testing Genetic Aspects of Sexual Selection Models
The good genes hypothesis predicts that females benefit from choosing attractive males because these males will produce offspring with superior viability. The critical requirement of this hypothesis is that attractive males need to have higher viability. Why should such a correlation between attractiveness and viability exist? The most prominent explanation is the following: attractive signals are costly, and only males with superior viability are able to afford these costs, so that only those males can attain high attractiveness. This view of condition-dependent signaling is supported by empirical data showing that in various insects, spiders, and frogs the production of attractive courtship signals consumes more energy than the production of less attractive signals. Also, males that suffer from disease or are starving are usually not as attractive as healthy competitors. Such a process of condition-dependent signaling leads to a genetic
correlation between male signaling traits and male viability when there is genetic variation for both traits. If females prefer males with signaling traits that are correlated with viability, female choice can evolve, since it will become associated with high viability; a rare female choice allele can thus be expected to increase in frequency when the costs of female choice are low. If females choose among males on the basis of a male trait, a genetic correlation will at the same time build up between male trait and female preference. Since this prediction is identical to the critical and sufficient condition for the arbitrary sexual selection model, experimental separation of these two nonexclusive hypotheses is difficult. The most frequent method of examining whether the good genes hypothesis contributes to maintaining female preference is to compare the viability of the offspring of preferred males with that of the offspring of average or nonpreferred males. In some studies using this method, female choice seems to have an astonishingly large effect on offspring viability; in other studies, no significant effect was observed (see Møller and Alatalo 1999 for a review). Theoretical models show that these indirect sexual selection benefits are unlikely to increase offspring fitness by more than a few percent (Kirkpatrick 1996). Since direct benefits are thought to be larger, the evolution of female preferences is predicted to be dominated by direct benefits whenever these can be obtained by female choice. However, even a small indirect benefit to sexual preferences may be large enough for quick evolution when choice is not very costly. In an exhaustive literature review, Møller and Alatalo (1999) showed that on average a significant positive correlation between male viability and attractiveness exists, and they therefore proposed that females can gain a few percent in offspring viability when they choose carefully among potential mates. In general, the evolution and maintenance of sexual preference seems to be due to various factors. Empirical evidence exists in favor of each of the most prominent hypotheses—direct benefits, sensory exploitation, Fisher's arbitrary sexual selection, and good genes models. These hypotheses are not mutually exclusive and some of them might work synergistically. In the future, more studies examining the quantitative genetics of sexually selected traits are necessary to evaluate the importance of the different hypotheses (Bakker 1999).
4. The Maintenance of Genetic Variation in Male Traits
4.1 Theoretical Expectation
Both indirect sexual selection models depend on the existence of genetic variance in male traits and will in
turn also cause strong selection on these traits. For theoretical reasons it has often been argued that the genetic variance of traits under strong selection will usually be smaller than the genetic variance of traits that are less strongly selected (Fisher 1930). This difference occurs because under strong selection all but the most beneficial alleles will disappear quickly from a population. In line with these arguments, life history traits that are closely connected to fitness have a lower additive genetic variance than morphological traits that are believed to be under weaker selection. If strong selection were to deplete genetic variance in male attractiveness or viability, the benefit of female preference would decrease; if there are costs to female choice, females would then benefit from refraining from choice and saving these costs. However, females seem to choose strongly among males in most animal species (Andersson 1994). Furthermore, female choice and the effects of sexual selection on male traits are most obvious in lekking species, where males court females at courtship territories and do not provide any resources for the females. In these cases direct benefits are unlikely, and it is not easy to see why sensory exploitation should be more frequent in these species. The indirect benefit models are thus the only ones likely to explain why females select strongly among the available males in lekking species, and elsewhere when males do not provide resources. However, when females exert sexual selection on male viability or attractiveness, the genetic variance in male quality should decrease, and the benefit of the preference will decrease in response. The important question thus is whether male genetic variance is large enough to counterbalance the costs of being choosy.
4.2 Empirical Evidence for Genetic Variance among Males
There is ample evidence for genetic variance in male signaling traits and in attractiveness: crosses between populations that differ in a male signaling trait show that the difference is inherited; heritability estimates from father–son comparisons show that genetic variation exists even within populations; and artificial selection experiments have proven that sexually selected male traits can evolve quickly. Despite strong sexual selection on male signaling traits, sexually selected traits generally seem to have heritabilities as large as the values for traits that are assumed to be under only weak selection (Pomiankowski and Møller 1995). The deviation from the theoretical expectation is even more impressive when one compares the additive genetic variance (another measure of genetic variance, one that does not depend on the extent of environmental influence on the trait under consideration) between sexually selected traits and traits that are assumed not to be under sexual selection. Sexually selected traits have significantly larger
additive genetic variance than other traits, showing that the existing genetic variation is sufficient for both indirect sexual selection models.
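As a point of reference for the two measures contrasted here (these are standard quantitative-genetic textbook definitions, not formulas given in this article): heritability is the proportion of total phenotypic variance attributable to additive genetic effects,

h^2 = V_A / V_P, \qquad V_P = V_A + V_D + V_I + V_E,

where V_A, V_D, V_I, and V_E denote the additive, dominance, epistatic, and environmental variance components, whereas the mean-standardized coefficient of additive genetic variation,

CV_A = 100 \sqrt{V_A} / \bar{X},

contains no environmental term (\bar{X} is the trait mean). Because V_E does not enter CV_A, comparisons based on this measure can reveal differences in additive variance even between traits whose heritabilities look similar.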
4.3 Possible Reasons for the Maintenance of Genetic Variance
The extent of genetic variation present in natural populations thus seems to contradict the theoretical expectations. It is, therefore, important to understand how extensive genetic variance can be maintained in the face of strong sexual selection. Among the hypotheses put forward to explain these seemingly contradictory data, the following three causes for the maintenance of genetic variance shall be discussed in some detail because they have received considerable credit: capture of genetic variance in condition, host–parasite coevolution, and meiotic drive. Male signals are likely to indicate male viability honestly only when attractive signals are costly, because otherwise males with low viability will also be able to produce these attractive signals. When signals are costly, only males in good condition will be able to produce attractive signals (Zahavi 1975). Due to this process, male signal quality will indicate a male's condition. Since male condition is assumed to possess large genetic variance, because many traits influence condition, male signals will capture genetic variance in condition and can in this way maintain the genetic variance in male viability necessary for the good genes model (Rowe and Houle 1996). Another hypothesis suggests that host–parasite coevolution is important in maintaining genetic variance in male signaling traits (Hamilton and Zuk 1982, Westneat and Birkhead 1998). Let us assume there are male traits that indicate that a male is free of parasites or resistant to them. If such resistance is heritable, females would clearly benefit from choosing those males, and female preference could accordingly evolve. With increasing resistance, selection on the parasite would lead to new types of parasites, which in turn would cause the selection of new resistance. This arms race can in principle lead to a persistent and large advantage to female preference for resistant males, because genetic variance in males is maintained as superior male genotypes change with time. There is some indication that such a process also works in humans, where scents from potential partners with dissimilar MHC genes (an immunologically important group of genes) are preferred (Wedekind and Furi 1997). Preference of females for males resistant to the action of meiotic drive has also recently been proposed as a scenario that allows persistent benefits of female choice. In stalk-eyed flies, females prefer males with longer eyestalks, and lines selected for longer eyestalks show increased resistance to meiotic drive (Wilkinson et al. 1998). It was, therefore, suggested that females
benefit from choosing males with large eyestalks because resistance to meiotic drive is more frequent in these males. Theoretical simulations, however, revealed that the predicted process cannot occur, although they showed that the avoidance of males possessing meiotic drive does allow persistent benefits for female preference (Reinhold et al. 1999).
5. Genetics of Sexually Selected Traits: X-chromosomal Bias
Based on reciprocal crosses between two Drosophila species, Ewing (1969) long ago proposed that a disproportionate part of the genes that influence traits important for mate recognition reside on the X-chromosome. Recently, two reviews revealed that X-chromosomal genes do in fact have a disproportionate effect on sex- and reproduction-related traits in humans and on sexually selected traits in animals. Using large molecular databases, Saifi and Chandra (1999) compared the linkage of mutations influencing traits related to sex and reproduction with that of mutations influencing all other traits. Their analysis shows that genes influencing traits related to sex and reproduction are several times more likely to be linked to the X-chromosome than other genes. With a different method, using literature data on reciprocal crosses, Reinhold (1998) found that the influence of X-chromosomal genes is much stronger for traits that are likely to be under sexual selection than for other traits. On average, about one third of the difference between the two parental lines used for the reciprocal crosses was caused by X-chromosomal genes when sexually selected traits were considered. For traits classified as not under sexual selection, this value was much smaller—on average, two percent of the difference was due to X-linked genes—and was not significantly different from zero. Such a bias towards the X-chromosome can be expected for traits that are influenced by sexually antagonistic selection (Rice 1984) and for sex-limited traits that are under fluctuating selection (Reinhold 1999). Antagonistic selection occurs if the optimal phenotype differs for males and females and if a genetic correlation between the sexes prevents the phenotypes from reaching their evolutionarily stable optimum. Under sexually antagonistic selection, sex-linked traits can be expected to evolve faster than other traits because sex-linked traits almost always differ in their expression in the two sexes. Rare recessive X-chromosomal genes, for example, will always be expressed when they occur in the heterogametic sex (the sex that has two different sex chromosomes; in humans the males, which possess an X as well as a Y-chromosome) and will hardly be expressed at all in the homogametic sex. This difference in expression then provides the raw material selection can work on, so that preferential X-linkage
can be expected for traits under sexually antagonistic selection. Under fluctuating selection the fate of an allele is influenced by its geometric mean fitness, a fitness measure analogous to the effective interest rate on a financial investment when the interest rate fluctuates in time. This fitness measure is influenced by the extent to which a trait is expressed, and between autosomal and X-chromosomal genes there is such a difference in the extent of expression: (a) X-chromosomal genes coding for sex-limited male traits in heterogametic males are expressed in only one third of their copies, because the other two thirds of all X-chromosomes are present in females, which do not express the genes under consideration; (b) autosomal sex-limited genes, i.e., genes that are expressed in only one sex and do not reside on the sex chromosomes but lie on any of the other chromosomes, are, in contrast, expressed in one half of their copies, provided they are not totally recessive. Because the geometric mean penalizes fluctuations in fitness, an allele exposed to selection in a smaller fraction of its copies suffers less under fluctuating selection; autosomal genes coding for the same phenotype as X-chromosomal genes therefore have a disadvantage compared to X-linked genes. As a consequence, X-chromosomal genes coding for sex-limited traits are expected to evolve more easily than autosomal genes (Reinhold 1999). The observed X-bias for sexually selected traits can accordingly be explained by the effect of fluctuating selection on these sex-limited traits.
See also: Genetics and Mate Choice; Sex Hormones and their Brain Receptors; Sexual Attitudes and Behavior; Sexual Orientation: Biological Influences; Sexual Orientation: Historical and Social Construction; Sexuality and Gender; Y-chromosomes and Evolution
Bibliography
Andersson M 1994 Sexual Selection. Princeton University Press, Princeton, NJ
Bakker T C M 1999 The study of intersexual selection using quantitative genetics. Behaviour 136: 1237–65
Bakker T C M, Pomiankowski A 1995 The genetic basis of female mate preferences. Journal of Evolutionary Biology 8: 129–71
Clutton-Brock T H, Vincent A C J 1991 Sexual selection and the potential reproductive rates of males and females. Nature 351: 58–60
Ewing A W 1969 The genetic basis of sound production in Drosophila pseudoobscura and D. persimilis. Animal Behaviour 17: 555–60
Fisher R A 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford, UK
Hamilton W D, Zuk M 1982 Heritable true fitness and bright birds: A role for parasites. Science 218: 384–7
Kirkpatrick M 1996 Good genes and direct selection in the evolution of mating preferences. Evolution 50: 2125–40
Kirkpatrick M, Ryan M J 1991 The evolution of mating preferences and the paradox of the lek. Nature 350: 33–8
Maynard Smith J 1991 Theories of sexual selection. Trends in Ecology and Evolution 6: 146–51
Møller A P, Alatalo R V 1999 Good-genes effects in sexual selection. Proceedings of the Royal Society of London Series B 266: 85–91
Pomiankowski A, Møller A P 1995 A resolution of the lek paradox. Proceedings of the Royal Society of London Series B 260: 21–9
Reinhold K 1998 Sex linkage among genes controlling sexually selected traits. Behavioural Ecology and Sociobiology 44: 1–7
Reinhold K 1999 Evolutionary genetics of sex-limited traits under fluctuating selection. Journal of Evolutionary Biology 12: 897–902
Reinhold K, Engqvist L, Misof B, Kurtz J 1999 Meiotic drive and evolution of female choice. Proceedings of the Royal Society of London Series B 266: 1341–5
Rice W R 1984 Sex chromosomes and the evolution of sexual dimorphism. Evolution 38: 735–42
Rowe L, Houle D 1996 The lek paradox and the capture of genetic variance by condition dependent traits. Proceedings of the Royal Society of London Series B 263: 1415–21
Ryan M J, Keddy-Hector A 1992 Directional patterns of female mate choice and the role of sensory biases. The American Naturalist 139: S4–S35
Saifi G M, Chandra H S 1999 An apparent excess of sex- and reproduction-related genes on the human X chromosome. Proceedings of the Royal Society of London Series B 266: 203–9
Wedekind C, Furi S 1997 Body odour preferences in men and women: Do they aim for specific MHC combinations or simply heterozygosity? Proceedings of the Royal Society of London Series B 264: 1471–9
Westneat D F, Birkhead T R 1998 Alternative hypotheses linking the immune system and mate choice for good genes. Proceedings of the Royal Society of London Series B 265: 1065–73
Wilkinson G S, Presgraves D C, Crymes L 1998 Male eye span in stalk-eyed flies indicates genetic quality by meiotic drive suppression. Nature 391: 276–9
Zahavi A 1975 Mate selection—a selection for a handicap. Journal of Theoretical Biology 53: 205–14
K. Reinhold
Sexual Risk Behaviors
By the mid-1980s, the menace of AIDS had made indiscriminate sexual behavior a serious risk to health. Under certain conditions sexual behavior can threaten health and psychosocial well-being. Unwanted pregnancies and sexually transmitted diseases (STDs) such as gonorrhea and syphilis have long been potential negative consequences of sexual contact, yet today HIV infection receives most attention. Because of this, the article will focus on the risk of HIV infection through sexual contact; HIV infection through needle sharing, maternal–child transmission, and transfusion of blood and blood products will not be addressed. This article is divided into five sections. Section 1 defines sexual risk behavior and delimits the scope. Section 2 discusses the research methodology in this field. Section 3 presents epidemiological data on sexual behavior patterns and HIV infection status among relevant population groups. Section 4 examines
the determinants of risk behavior. Section 5 concludes with a discussion of strategies for reducing risk exposure.
1. Definitions
Sexuality is an innate aspect of humanity and is oriented toward sensory pleasure and instinctual satisfaction. It goes beyond pure reproduction and constitutes an important part of sensual interaction. Sexual behavior embodies the tension between biological determination, societal and cultural norms, and an individual's personal life choices. Sexuality and sexual behavior are subject to constant change. Sexual mores and practices differ widely both within and between cultures, and also from epoch to epoch. One defining feature of sexuality is, therefore, the diversity of sexual conduct and the attitudes surrounding it. Sexual risk behavior can be described as sexual behavior or actions which jeopardize the individual's physical or social health. High-risk sexual practices, such as unprotected intercourse with infected individuals, constitute unsafe sexual conduct and therefore deserve to be classified as a health risk behavior or behavioral pathogen (Matarazzo et al. 1984) on a par with smoking, lack of exercise, unhealthy diet, and excessive consumption of alcohol. Among the STDs, which include gonorrhea, syphilis, genital herpes, condylomata, and hepatitis B among others, HIV infection and AIDS are by far the most dangerous; AIDS has linked love and sexuality with disease and death. If an HIV-infected person fails to inform a partner of his or her serostatus (positive HIV antibody test), unprotected sexual intercourse resulting in the partner's infection also has legal consequences. In general, unprotected sexual intercourse (i.e., without condoms) with an HIV-infected partner is risky. HIV may be acquired asymptomatically and transmitted unknowingly. There is a 'hierarchy' in the level of risk involved in sexual practices. High-risk sexual practices include unprotected anal intercourse (especially for the receptive partner) and unprotected vaginal intercourse (again, more for the receptive partner) with infected individuals, as well as any other practices resulting in the entry or exchange of sexual fluids or blood between the partners. Oral sex, on the other hand, carries only a low risk (see Sexually Transmitted Diseases: Psychosocial Aspects). Petting is not considered sexual risk behavior. The risk of sexual behavior is a function of the partner's infection status, the sexual practices employed, and the protective measures used. Sex with a large number of sexual partners is also risky because of the higher probability of coming into contact with an infected partner. The risk can be minimized or ruled out entirely by appropriate protective measures: low-risk sexual practices, use of condoms, and avoidance of sexual contact with HIV-positive partners.
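The multiplicative effect of partner number can be made explicit with a simple Bernoulli-trial approximation (an illustrative textbook calculation, not a formula given in this article). If \pi denotes the prevalence of infection among potential partners and \beta the probability of transmission within one partnership, the probability of escaping infection over n independent partnerships is (1 - \pi\beta)^n, so the cumulative risk is

P(\text{infection}) = 1 - (1 - \pi\beta)^{n}.

Even a small per-partnership risk compounds: with \pi\beta = 0.01, ten partners already yield a cumulative risk of about 10 percent. In these terms, condom use and low-risk practices reduce \beta, while partner selection reduces \pi.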
Sexual risk behavior also applies to the area of family planning. Unwanted pregnancies may occur if the sexual partners do not use safe methods of contraception. If the partners do not wish to have children, sexual contact risks an undesired outcome. Unwanted pregnancy can have consequences ranging from changes in life plans and in the partnership to a decision to abort, with the ensuing emotional stress and medical risks. Deviant sexual behavior, such as exhibitionism or pedophilia, is not risk behavior in this sense. The analysis of sexual risk behavior falls in the domains of psychology, sociology, medicine, and the public health sciences (Bancroft 1997, McConaghy 1993). Research is available in the areas of health psychology, health sociology, and social science AIDS research (Van Campenhoudt et al. 1997; see the journal AIDS Care; see HIV Risk Interventions). The terms used in studies more frequently than 'sexual risk behavior' are 'HIV risk behavior,' 'HIV protective behavior,' 'AIDS-preventive behavior,' 'HIV-related preventive behavior,' 'safer sexual behavior,' 'safe sex,' or 'safer sex' (DiClemente and Peterson 1994, Oskamp and Thompson 1996). These descriptions ought to be given preference over the term 'sexual risk behavior' in order to avoid pathologizing certain sexual practices; it is not the sexual practice as such which presents the risk, but the partner's infection status. Although the term 'sexual risk behavior' will nevertheless be used here, this issue ought to be kept in mind.
2. Research on Sexual Risk Behavior
Assessment of sexual risk behavior is complicated by problems of research methodology. Fundamentally, research in sexology faces the same methodological problems as the other social sciences. Studies of sexual behavior require special survey instruments and must take into account factors influencing data acquisition and results (see Bancroft 1997). The most important methods of data collection include questionnaires, personal interviews, telephone surveys, and self-monitoring. None of these methods is clearly superior to the others. The quality of retrospective data depends on the selected period, the memory capacity of study participants, the frequency of the behavior, and whether the behavior is typical. In general, in research on sexuality no adequate external criteria are available to validate reporting. Psychophysiological data exist only to a limited extent, and field experiments and participatory observations are usually not applicable. Since the different survey methods and instruments have different advantages and disadvantages, a combination of methods would be appropriate. However, this is often not feasible due to practical constraints, including limited resources and constrained access to target groups. When conducting scientific surveys on sexual behavior, it is important to inform and instruct study
participants about the purpose of the investigation. Questions must be phrased in a neutral and unprejudiced manner. For sensitive issues, such as taboo topics and unusual sexual practices, the phrasing of questions and their position in the questionnaire are critical. When surveying populations with group-specific language codes (e.g., minorities that have a linguistic subculture), it is necessary to discuss sexual terms with the interviewees beforehand. The willingness to participate in the study and to answer survey questions can be influenced by the fear of reprisals if the answers become public. Depending on the objectives of the study and the study sample, relative anonymity (via telephone interview) can play an important role in the willingness to participate and the openness of responses. Independent of the selected survey instrument, exaggerations, understatements, and socially desirable answers are to be expected, particularly to questions about sexual risk behavior. Study participants report their own behavior based on role expectations and their self-image. Thus, it is important to assess and control for appropriate motivational and dispositional variables, such as social desirability, the tendency for self-disclosure, and attitudes toward sexuality. The person of the researcher, in particular his or her sex, age, and sexual orientation, has an important influence on response behavior. For example, for a heterosexual study sample it may be advisable that male participants be questioned by a male interviewer and female participants by a female interviewer. A heterosexual interviewer must be prepared not to be accepted by homosexual participants. This applies in particular when questions address highly intimate sexual experiences and sexual risk behavior. Generalizability is one of the most serious problems when reporting data from scientific surveys on sexuality. Access to the study sample and the willingness to participate are more closely linked to relevant personal characteristics of the participants (sexual experiences, attitudes toward sexuality) than in other research. Participants tend to show, for example, greater self-disclosure, a broader repertoire of sexual activities, less guilt, and less anxiety than nonparticipants. Since randomized or quota selection is rarely possible, an exact description of the sampling frame is essential to permit an estimate of the representativeness of the sample and the generalizability of the results. Generally, study findings are often based on reports by individuals who are, to a substantial degree, self-selected. This must be taken into consideration when evaluating the findings of studies on sexual behavior.
3. Epidemiology

Data on the epidemiology of sexual risk behavior can be acquired from different sources. Numbers can be obtained through studies on sexual risk behavior, focusing on the behavior itself. As mentioned above,
methodological problems often plague such studies. Even if rates of condom use and numbers of sexual partners are assessed correctly, they may not provide the specific information needed to describe the epidemiology of sexual risk behavior. The finding that about 50 percent of heterosexual men do not regularly use condoms is only relevant if they forgo condoms in sexual encounters with potentially infected partners. This information, however, is rarely available from existing studies. Numbers can also be extrapolated from incidence and prevalence rates of sexually transmitted diseases or unwanted pregnancies. These data focus on consequences rather than the risk behavior itself, and extrapolation to the prevalence of risk behaviors is complicated by the fact that it is sometimes difficult to distinguish how an infection was transmitted (sexual contact, transfusions, or needle sharing). In addition, reporting may be incomplete or inaccurate for the following reasons: (a) not all countries require compulsory registration of HIV infections, (b) cases are often recorded anonymously, which may result in multiple registrations of the same case, and (c) there may be a high number of unreported cases that do not find their way into the statistics. Despite these methodological problems, epidemiological data are necessary and useful for behavioral research. The available epidemiological data on sexual risk behavior from European and US sources are presented below for five different subgroups: adolescents, ethnic groups, homosexuals, heterosexuals, and prostitutes. These subgroups are not exhaustive, but they elucidate the differences and specificity of epidemiological data on sexual risk behavior (see Sexually Transmitted Diseases: Psychosocial Aspects).

3.1 Adolescents

Adolescents comprise a special group with respect to sexual risk behavior. Their initiation into sexuality may shape their sexual behavior for many years to come. They have as yet no rigid sexual behavior patterns and are subject to influences from their peers, parents, the media, and other sources. Compared to the 1970s, teenagers are sexually active at a younger age. Many teenagers (about 25 percent of males and 40 percent of females) have had sexual intercourse by the time they reach 15 years of age, and the mean age of first sexual intercourse is between 16 and 17 years. Because adolescents are sexually active earlier in their lives, they also engage in sexual risk behavior at an earlier age. The majority have several short-term sexual relationships, and by the end of their teens about half report having had more than four partners. The majority of adolescents report being exclusively heterosexual, but an increasing number of teenage males report being bisexual or homosexual. Data on infection rates show that sexually transmitted diseases are widespread among adolescents.
The human papilloma virus (HPV) is likely to be the most common STD among adolescents, with a prevalence of 28–46 percent among women under the age of 25 in the US (Centers for Disease Control and Prevention 2000). The prevalence of HIV infection is increasing slightly among adolescents, but accurate data are difficult to obtain. Because of the long incubation period of HIV, those diagnosed with AIDS in their twenties probably contracted the virus as adolescents, and such cases are on the rise. Teenage pregnancy is still an issue, although the numbers generally have decreased. They remain high, especially in the US, where 11 percent of all females between ages 15 and 19 become pregnant (Adler and Rosengard 1996). Data on behavior indicate that about 75 percent of adolescents use contraception. Adolescents in steady relationships predominantly use the pill, while adolescents with casual sexual encounters mainly use condoms. Most adolescents are aware of the risk of pregnancy when sexually active, and they use condoms for the purpose of contraception rather than for protection against STDs. Most of them have sufficient knowledge about HIV infection and AIDS, although erroneous assumptions still prevail. Personal vulnerability to AIDS is perceived to be fairly low, and awareness of other sexually transmitted diseases is practically nonexistent. Alcohol and drug use, common among adolescents, further influences sexual behavior, including reducing the frequency of condom use.

3.2 Ethnic Groups

There are distinct differences in HIV infection among different ethnic groups. Data on infection rates are available mainly from studies conducted in the US. These show that African–Americans face the highest risk of contracting HIV. Although African–Americans make up about 12 percent of the US population, they account for 57 percent of HIV diagnoses and 45 percent of AIDS cases. Almost two-thirds of all reported AIDS cases in women are among African–Americans. The incidence rate of reported AIDS cases is eight times that of whites (Centers for Disease Control and Prevention 1999). The Hispanic population has the next highest prevalence rates. Hispanics accounted for about 20 percent of the total number of new AIDS cases in 1998, while their population representation was about 13 percent. Their rate is about four times that of whites. It is likely that differences in sexual behavior underlie these statistics, but most studies of behavior have focused on one specific group rather than comparing groups with each other. Possible reasons for differences in infection rates could be that (a) members of ethnic minorities often have a lower socioeconomic status and less access to health care, (b) they are less educated and score lower on HIV risk behavior knowledge, (c) they
communicate less with partners about sexual topics, and men often have a dominant role in relationships, such that women have difficulty discussing promiscuity and condom use, and (d) men often are less open about their sexual orientation and their HIV status compared to white men.
3.3 Homosexuals

Homosexual men (men who have sex with men) are a high-risk group for contracting HIV infection. Estimates suggest that 5 percent to 8 percent of all homosexual men are HIV positive. Unprotected sexual contact among homosexual men accounts for about 50 percent of all HIV infections (Robert Koch-Institut 1999). Compared to other groups, homosexual men are more likely to engage in sexual risk behavior, including receptive or insertive anal sex and higher numbers of sexual partners. However, prevention campaigns seem to have influenced the sexual behavior of this group. An estimated three-quarters of homosexual men now report using condoms, especially with anonymous partners. There also seems to be a decline in the overall number of sexual partners. However, some homosexual men continue to expose themselves to considerable risk of HIV infection. Although more homosexual men are using condoms, they often do not use them consistently in all potentially risky sexual encounters. Also, those who still engage in unprotected sex are usually sexually very active and/or have multiple partners. Studies on sexual behavior often overlook this fact when reporting rates of condom use. Homosexual women are at practically no risk of HIV infection through sexual behavior.
3.4 Heterosexuals

Heterosexuals in monogamous relationships are at low risk for contracting HIV. However, often both partners have had previous sexual contacts, and there is no guarantee that none of those past partners was infected. Most cases of HIV transmission in this setting are male to female and result from the man's past high-risk sexual contacts, including homosexual relations and encounters with prostitutes. Sex tourism to countries with high rates of HIV adds particular risk. Differences in sexual behavior between men and women, combined with the relatively higher efficiency of HIV transmission from male to female, have the consequence that in developed countries only 5 percent of all HIV infections in men are due to unprotected heterosexual intercourse, whereas for women the figure is 33 percent. Rates of condom use among heterosexual couples vary from study to study, but about 40–60 percent of sexually active individuals report not using condoms. No conclusive data are available on the settings in which condoms are used. Up to 23 percent of heterosexual men and 35 percent of heterosexual women report having had two or more sexual partners in the past five years, and up to 6 percent of men and 3 percent of women admit to extramarital sex in the past 12 months. Thus, condom use in these settings is of greatest relevance (Johnson et al. 1994, Laumann et al. 1994). Most individuals in monogamous relationships use condoms as a method of contraception rather than as a means of protection against infection.
3.5 Prostitutes

There seems to be a distinction between 'drug-related' and 'professional' prostitutes. HIV infections are fairly uncommon among professional prostitutes in developed countries; studies report rates of 0.5 percent to 4 percent. Rates of infection among drug-related prostitutes are much higher, with about 30 percent being HIV-infected. Behavioral patterns differ between the two groups. Drug-related prostitutes may be forced by either financial need or dependence on a drug supplier to conform to the wishes of their clients, often including sexual intercourse without condoms. This is less the case for professional prostitutes, who have more of a choice in their clients and who can insist on using condoms. Rates of unprotected sex among drug-related prostitutes range from 41 percent to 74 percent. In contrast, rates for professional prostitutes range from 20 percent to 50 percent. Rates may be even higher for male prostitutes (Kleiber and Velten 1994).
4. Determinants of Risk Behavior
Models that attempt to describe and explain HIV risk and protective behavior include the following factors: cultural factors (e.g., ethnic group, social norms), social and environmental factors (e.g., membership in subgroups, knowing HIV-infected individuals), demographic factors (e.g., socioeconomic status, marital status), biographic factors (e.g., sexual orientation, attitude toward health), and psychosocial factors (e.g., level of knowledge about STD risk, self-efficacy, attitude toward condoms). Sociological models emphasize the way risk behavior is influenced by social class, educational level, and the overall social context. Social disadvantage often goes hand in hand with a lack of opportunities for health maintenance and medical care, and with a higher level of risk-taking behavior. Sexual behavior is embedded in the mode of living of the individual and is closely tied to the social environment. Some sociological theories focus on communication in intimate relationships and on economic factors (e.g., financial dependence). Especially among subgroups such as substance abusers, prostitutes, and the
socioeconomically disadvantaged, social and economic conditions can be the central factor that controls behavior. Psychological models focus on determinants of risk behavior that entail processes taking place within the individual. Although a large number of studies have been published since the mid-1980s, and despite a long tradition of research into risk-taking behavior even before the era of AIDS (Yates 1992), there is no comprehensive and sufficiently well-founded theory to explain sexual risk behavior (Bengel 1996). Most significant in the field of sexual risk behavior are the social-psychological models (e.g., the Theories of Reasoned Action and Planned Behavior, Protection Motivation Theory, and the AIDS Risk Reduction Model; see Health Behavior: Psychosocial Theories). For the purpose of this article the AIDS Risk Reduction Model is presented because it entails the most direct links to preventive strategies. It distinguishes between (a) demographic and personality variables, (b) labeling stage variables, (c) commitment stage variables, and (d) enactment stage variables (Catania et al. 1990). Demographic factors such as gender, age, and education, as well as personality factors such as impulsivity or the readiness to take risks, contribute little to explaining sexual risk behavior (Sheeran et al. 1999). Also of limited predictive value are labeling stage variables such as knowledge of AIDS, sexual experience, and threat appraisal or risk perception. The assumption is that each individual makes a personal assessment of the risk of infection or disease (perceived vulnerability and perceived severity). A heterosexual, monogamous male, for instance, may perceive the menace of AIDS as a severe health issue, but may feel personally invulnerable or nonsusceptible. Moreover, some individuals who engage in manifest sexual risk behavior may underestimate their personal risk as compared to others, thereby displaying an 'optimistic bias' (see Health Risk Appraisal and Optimistic Bias). Many studies have shown significant but weak correlations between threat appraisal variables and risk behavior. Commitment stage variables influence behavior more substantially and include: (a) social influence: perception of social pressure from significant others to use or not use a condom and of a sexual partner's attitude toward condoms; (b) beliefs about condoms: attitudes toward condoms, intentions, and perceived barriers to condom use; (c) self-efficacy: confidence in the ability to protect oneself against HIV; and (d) pregnancy prevention: condom use for contraceptive purposes. Social pressure, self-efficacy, attitudes toward condoms, and previous use of condoms correlate closely and significantly with HIV-protective behavior. The extent to which protective or risky behavior is displayed also depends on situational and interactive factors, the enactment stage variables. Particularly in casual sexual encounters, lack of immediate
condom availability can be the decisive determinant of risk behavior. The nature of the relationship and, in particular, the level of communication about safe sex play a central role in risk behavior. Can the partners communicate about HIV and protective behavior, or are they afraid of offending the partner and jeopardizing what could otherwise be a valued romance? The significance of the influencing factors above varies, depending on the target behavior (e.g., condom use, sexual practices) and on the target group (e.g., homosexuals, adolescents, prostitutes; see, e.g., Flowers et al. 1997). All available theoretical approaches and models assume that HIV-protective behavior is governed by a rational decision process. Emotional and motivational factors, as well as planned behavior and action control, largely have been disregarded in these models and have also been insufficiently researched. After experiencing a risk situation, individuals change both their risk perception and their appraisal of the options available for risk management. Especially when uncertainty or fear about HIV infection is high, cognitive coping (e.g., 'I know my partner's friends, so he is not HIV infected') and behavioral coping (e.g., seeking HIV antibody testing) are deployed.
5. Strategies for Behavioral Change

Reducing the rates of infection with HIV and other sexually transmitted diseases and preventing unwanted pregnancies constitute major tasks for health science and policy. Although sexual risk behavior in the most important target groups is difficult to assess, and explanatory models lack empirical validation and are incomplete, preventive programs must be developed and implemented (Kelly 1995). A societal agreement on target groups and on the methods used in such prevention programs is essential. The need to prevent the spread of AIDS has triggered lively and controversial discussions in many countries: should the emphasis be on information and personal responsibility, or should regulatory measures be employed to stem the tide of the disease? Controversy has raged among scientists about the ability to influence sexual behavior and the right of the state to intervene in such intimate affairs. There is agreement, however, that sexual risk behavior cannot be regarded as an isolated mode of personal conduct, but must be seen in the context of an individual's lifestyle and social environment. Prevention programs promote the use of condoms as the basic method of protection. They promulgate a message of personal responsibility to prevent risk: 'protect yourself and others.' AIDS prevention programs in Western European and North American countries have pursued two objectives: (a) to convey basic information on modes of transmission and methods of protection and (b) to motivate the
population to assess individual risk and undertake behavioral change if needed. These recommendations start by urging 'safe sex,' that is, use of condoms and avoidance of sexual practices in which body fluids are exchanged, in any situation where infection is a potential risk. Individuals are also advised to reduce the number of sexual partners, to avoid anonymous sexual partners, and to reduce the use of substances that may result in loss of control. Prevention programs use mass-communication, personal, and structural measures. Mass communication involves dissemination of information via media such as radio, TV, newspapers, and posters, as well as distribution of brochures and informational leaflets. These media may be intended for all audiences or may be aimed at a specific group. The information is conveyed in simple, concrete, everyday language, describing the modes of transmission and the options available for protection. Personal measures include telephone hotlines, events and seminars for special target groups, street work, individual counseling, and information for sex tourists. These person-to-person prevention programs aim at fostering recognition of the problem, improving the level of knowledge, and changing attitudes, intentions, and behaviors of members of particular target groups. Structural measures include the provision of sterile syringes for intravenous drug users, easy access to condoms, and the improvement of the social situation of prostitutes. Preventive measures must be tailored to the lifestyle, environment, and language of each target group, given its specific risk behavior pattern and risk of HIV infection. Programs that rely entirely on the generation of fear enhance risk perception but offer no coping alternatives. Specific messages alone, for instance an appeal for condom use, are also often inadequate to bring about (permanent) behavior changes. Communication among sexual partners should be encouraged as one of the crucial target parameters of prevention. Evaluation of prevention programs has suggested that they have been successful at conveying the most important information about AIDS. In certain sections of the population there are nevertheless uncertainties, irrational assumptions, and false convictions about the risk of infection through such activities as kissing or work contacts. The acceptance level of this information varies widely depending on the target group and is lowest among intravenous drug users. As expected, outside of those groups at especially high risk, the greatest fear of infection prevails among persons below the age of 35 and among singles. Only moderate behavioral change has been found among intravenous drug users. Rapid and significant changes in behavior have been found among homosexual males, especially in cities (fewer sexual partners, fewer high-risk sexual practices, increased use of condoms). Yet only a minority practices safe sex all the time, and some of the behavioral changes are relatively unstable.
Some experts fear that the messages of preventive campaigns are wearing off and that some individuals are now less concerned about becoming infected than in the past. This may be due to the impact of the new antiviral drugs, which have changed the perception of HIV from that of a death sentence to that of a chronic, manageable disease.

See also: Adolescent Health and Health Behaviors; Health Behavior: Psychosocial Theories; Health Behaviors; Health Risk Appraisal and Optimistic Bias; HIV Risk Interventions; Regulation: Sexual Behavior; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexually Transmitted Diseases: Psychosocial Aspects
Bibliography

Adler N E, Rosengard C 1996 Adolescent contraceptive behavior: Raging hormones or rational decision making? In: Oskamp S, Thompson S C (eds.) Understanding and Preventing HIV Risk Behavior: Safer Sex and Drug Use. Sage, Thousand Oaks, CA
Bancroft J (ed.) 1997 Researching Sexual Behavior: Methodological Issues. Indiana University Press, Bloomington, IN
Bengel J (ed.) 1996 Risikoverhalten und Schutz vor AIDS: Wahrnehmung und Abwehr des HIV-Risikos—Situationen, Partnerinteraktionen, Schutzverhalten (Risk behavior and protection against AIDS: Perception of and defense against the risk of HIV—situations, partner interactions, protective behavior). Edition Sigma, Berlin
Van Campenhoudt L, Cohen M, Guizzardi G, Hausser D (eds.) 1997 Sexual Interactions and HIV-Risk. Taylor & Francis, London
Catania J A, Kegeles S M, Coates T J 1990 Towards an understanding of risk behavior: An AIDS risk reduction model. Health Education Quarterly 17: 53–72
Centers for Disease Control and Prevention 1999 HIV/AIDS among African–Americans (online). Available: http://www.cdc.gov/hiv/pubs/facts/afam.pdf
Centers for Disease Control and Prevention 2000 Tracking the hidden epidemics: Trends in the STD epidemics in the United States (online). Available: http://www.cdc.gov/nchstp/dstd/Stats_Trends/Stats_and_Trends.htm
DiClemente R J, Peterson J L (eds.) 1994 Preventing AIDS. Plenum, New York
Flowers P, Sheeran P, Beail N, Smith J A 1997 The role of psychosocial factors in HIV risk-reduction among gay and bisexual men: A quantitative review. Psychology and Health 12: 197–230
Johnson A M, Wadsworth J, Wellings K, Field J 1994 Sexual Attitudes and Lifestyles. Blackwell, Oxford, UK
Kelly J A 1995 Changing HIV Risk Behavior: Practical Strategies. Guilford Press, New York
Kleiber D, Velten D 1994 Prostitutionskunden: Eine Untersuchung über soziale und psychologische Charakteristika von Besuchern weiblicher Prostituierter in Zeiten von AIDS (Clients of prostitutes: An investigation into social and psychological characteristics of customers of female prostitutes in the era of AIDS). Nomos Verlagsgesellschaft, Baden-Baden
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality: Sexual Practices in the United States. University of Chicago Press, Chicago
Matarazzo J D, Weiss S M, Herd J A, Miller N E, Weiss S M (eds.) 1984 Behavioral Health: A Handbook of Health Enhancement and Disease Prevention. Wiley, New York
McConaghy N (ed.) 1993 Sexual Behavior: Problems and Management. Plenum, New York
Oskamp S, Thompson S C (eds.) 1996 Understanding and Preventing HIV Risk Behavior: Safer Sex and Drug Use. Sage, Thousand Oaks, CA
Robert Koch-Institut 1999 AIDS/HIV-Halbjahresbericht I/99: Bericht des AIDS-Zentrums im Robert Koch-Institut über aktuelle epidemiologische Daten (AIDS/HIV semiannual report I/99: Report by the AIDS Center at the Robert Koch-Institut on current epidemiological data). Berlin
Sheeran P, Abraham C, Orbell S 1999 Psychosocial correlates of heterosexual condom use: A meta-analysis. Psychological Bulletin 125: 90–132
Yates J F 1992 Risk-taking Behavior. Wiley, Chichester, UK
J. Bengel
Sexuality and Gender

The meaning of the terms sexuality and gender, and the ways that writers have theorized the relationship between the two, have changed considerably over the last 40 years. The term sexuality has various connotations. It can refer to forms of behavior, it may include ideas about pleasure and desire, and it is also used to refer to a person's sense of sexual being, a central aspect of one's identity, as well as to certain kinds of relationships. The concept of gender has also been understood in relation to varying criteria. Prior to the 1960s, it was a term that referred primarily to what is coded in language as masculine or feminine. The meaning of the term gender has subsequently been extended to refer to personality traits and behaviors that are specifically associated either with women or men; to any social construction having to do with the male/female distinction, including those which demarcate female bodies from male bodies; and to gender as denoting materially existing social groups, 'men' and 'women,' that are the product of unequal relationships. In this latter sense, gender as a socially meaningful category is dependent on a hierarchy already existing in any given society, in which one class of people (men) has systematic and institutionalized power and privilege over another class of people (women) (Delphy 1993). The term patriarchy or, more recently, the phrase 'patriarchal gender regimes,' is used as a way of conceptualizing the oppression of women which results. More recently, the notion of gender as social practice has emerged and is associated with the work of Judith Butler (1990), who argues that gender is performatively enacted through a continual citation
and reiteration of social norms. Butler offers a similar analysis of sexuality, claiming that far from being fixed and naturally occurring, (hetero)sexuality is 'unstable,' dependent on ongoing, continuous, and repeated performances by individuals 'doing heterosexuality,' which produce the illusion of stability. There is no 'real' or 'natural' sexuality to be copied or imitated: heterosexuality is itself continually in the process of being reproduced. Such ideas are part of the establishment of a new canon of work on sexuality and gender that has emerged since the 1960s. This newer approach differs radically from the older tradition put forward by biologists, medical researchers, and sexologists, which developed through the late nineteenth century and was profoundly influential during the first half of the twentieth century. The traditional approach to understanding sexuality and gender has been primarily concerned with establishing 'natural' or 'biological' explanations for human behavior. Such analyses are generally referred to as essentialist. More recent approaches, although not necessarily denying the role of biological factors, have emphasized the importance of social and cultural factors; this is now commonly known as the social constructionist approach. The sociological study of sexuality emerged in the 1960s and 1970s and was informed by a number of theoretical approaches that were significant at the time, notably symbolic interactionism, labeling theory and deviancy theory, and feminism. The work of writers such as John Gagnon and William Simon (1967, 1973) in the US, and Mary McIntosh (1968) and Kenneth Plummer (1975) in the UK, was particularly important in establishing a different focus for thinking about sexuality. A primary concern of such works was to highlight how sexuality is social rather than natural behavior and, as a consequence, a legitimate subject for sociological enquiry. Gagnon and Simon developed the notion of sexual scripts which, they argued, we make use of to help define who, what, where, when, how, and—most antiessentialist of all—why we have sex. A script refers to a set of symbolic constructs which invest actors and actions, contexts and situations, with 'sexual' meaning—or not, as the case may be. People behave in certain ways according to the meanings that are imputed to things; meanings which are specific to particular historical and cultural contexts; meanings that are derived from scripts learnt through socialization and which are modified through ongoing social interactions with others. Most radical of all, Gagnon and Simon claimed that not only is sexual conduct socially learnt behavior, but the reason for wanting to engage in sexual activity, what in essentialist terms is referred to as sexual 'drive' or 'instinct,' is in fact a socially learnt goal. Unlike Freud, who claimed the opposite to be true, Gagnon and Simon suggested that social motives underlie sexual actions. They saw gender as central to this and detailed how in
contemporary Western societies sexual scripts are different for girls and boys, women and men. Here gender is seen as a central organizing principle in the interactional process of constructing sexual scripts. In this sense gender can be seen as constitutive of sexuality, at the same time as sexuality can be seen as expressive of gender. Thus, for example, Gagnon and Simon argue that men frequently express and gratify their desire to appear 'masculine' through specific forms of sexual conduct. For example, for young men in most Western cultures first sexual intercourse is a key moment in becoming a 'real man,' whereas this is not the same for young women. It is first menstruation rather than first heterosex that marks being constituted as 'women.' Another important contribution to the contemporary study of sexuality, which has posed similar challenges to essentialist theories, is the discourse analysis approach. One example is the work of Michel Foucault (1979) and his followers, who claim that sexuality is a modern 'invention' and that by taking 'sexuality' as their object of study, various discourses, in particular medicine and psychiatry, have produced an artificial unity of the body and its pleasures: a compilation of bodily sensations, pleasures, feelings, experiences, and actions which we call 'the sexual.' Foucault understands sex not as some essential aspect of personality governed by natural laws that scientists may discover, but as an idea specific to certain cultures and historical periods. Foucault draws attention to the fact that the history of sexuality is a history of changing forms of regulation and control over sexuality. What 'sexuality' is defined as, and its importance for society and for us as individuals, may vary from one historical period to the next. Furthermore, Foucault argues, as do interactionists, that sexuality is regulated not only through prohibition but is also produced through definition and categorization, in particular through the creation of sexual categories such as, for example, 'heterosexual' and 'homosexual.' Foucault argues that, while both heterosexual and homosexual behavior has existed in all societies, there was no concept of a person whose sexual identity is 'homosexual' until relatively recently. Although there is some disagreement among writers as to precisely when the idea of the homosexual person emerged, it has its origins in the seventeenth to nineteenth centuries, with the category lesbian emerging somewhat later than that of male homosexuality. Such analyses have also highlighted how medical and psychiatric knowledge during the late nineteenth and early twentieth centuries was a key factor in the use of the term 'homosexual' to designate a certain type of person rather than a form of sexual conduct. A major criticism of Foucault's work is that insufficient attention is given to examining the relationship between sexuality and gender. Feminist writers in particular have pointed out how in Foucault's account of sexuality there is little analysis
of how women and men often have different discourses of sexuality. Sexuality is employed as a unitary concept and, such critics claim, that concept is male. Despite such criticisms, many feminists have utilized Foucauldian perspectives. A further challenge to essentialist ideas about sexuality and gender is associated with psychoanalysis, in particular the reinterpretation of Freud by Jacques Lacan. For Lacan and his followers sexuality is not a natural energy that is repressed; it is language rather than biology that is central to the construction of 'desire.' Lacanian psychoanalysis has had a significant influence on the development of feminist theories of sexuality and gender, although some writers have been critical of Lacan's work (Butler 1990). At the same time as social scientists and historians were beginning to challenge the assumption that sexual desires and practices were rooted in 'nature,' more and more people were beginning to question dominant ideas about gender roles and sexuality. The late 1960s and early 1970s saw the emergence of both women's and gay and lesbian liberation movements in the US and Europe. An important contribution to analyses of sexuality and gender at that time was the distinction feminists, along with some sociologists and psychologists, sought to make between the terms sex and gender. Sex referred to the biological distinction between females and males, whereas gender was developed and used as a contrasting term to sex. Gender refers to the social meanings and value attached to being female or male in any given society, expressed in terms of the concepts femininity and masculinity, as distinct from that which is thought to be biologically given (sex). Feminists have used the sex/gender distinction to argue that although there may exist certain biological differences between females and males, societies superimpose different norms of personality and behavior that produce 'women' and 'men' as social categories. It is this reasoning that led Simone de Beauvoir (1964) to famously remark that one is not born a woman. More recently, a new understanding of gender has emerged. Rather than viewing sex and gender as distinct entities, sex being the foundation upon which gender is superimposed, gender has increasingly been used to refer to any social construction to do with the female/male binary, including male and female bodies. The body, it is argued, is not free from social interpretation, but is itself a socially constructed phenomenon. It is through understandings of gender that we interpret and establish meanings for the bodily differences that are termed sexual difference (Nicholson 1994). Sex, in this model, is subsumed under gender. Without gender we could not read bodies as differently sexed; gender provides the categories of meaning for us to interpret how the body appears to us as 'sexed.' Feminists have critiqued essentialist understandings of both sexuality and gender and have played an
important role in establishing a body of research and theory that supports the social constructionist view. However, feminist theories of sexuality are not only concerned with detailing the ways in which our sexual desires and practices are socially shaped; they are also concerned to ask how sexuality relates to gender and, more specifically, what the relationship is between sexuality and gender inequality. It is this question which perhaps more than any other provoked discussion and controversy between feminists during the 1970s and 1980s. Most feminists would agree that historically women have had less control in sexual encounters than their male partners and are still subjected to a double standard of sexual conduct that favors men. It is, for example, seen as 'natural' for boys to want to have sex, and with different partners, whereas exactly the same behavior that would be seen as understandable and extolled in a boy is censured in a girl. Sexually active women are subject to criticism and are in danger of being regarded as a 'slut' or a 'slag.' Where feminists tend to differ is over the importance of sexuality in understanding gendered power differences. For many radical feminists sexuality is understood to be one of the key mechanisms through which men have regulated women's lives. Sexuality, as it is currently constructed, is not merely a reflection of the power that men have over women in other spheres, but is also productive of those unequal power relationships. Sexuality both reflects and serves to maintain gender divisions. From this perspective the concern is not so much how sexual desires and practices are affected by gender inequalities but, more generally, how constructions of sexuality constrain women in many aspects of their daily lives, from restricting their access to public space to shaping health, education, work, and leisure opportunities (Richardson 1997). Fears of sexual violence, for instance, may result in many women being afraid to go out in public on their own, especially at night. It is also becoming clearer how sexuality affects women's position in the labor market in numerous ways, from women being judged by their looks as right or wrong for a job, to sexual harassment in the workplace as a common reason given by women for leaving paid employment. Other feminists have been reluctant to attribute this significance to sexuality in determining gender relations. They prefer to regard the social control of women through sexuality as the outcome of gendered power inequalities, rather than its purpose. There is then a fundamental theoretical disagreement within feminist theories of sexuality over the extent to which sexuality can be seen as a site of male power and privilege, as distinct from something that gendered power inequalities act upon and influence. In the 1990s a new perspective on sexuality and sexual politics emerged, fueled by the impact of HIV and AIDS on gay communities and the anti-homosexual feelings and responses that HIV/AIDS
revitalized, especially among the 'moral right.' One response by scholars was queer theory, a diverse body of work that aims to question the assumption in past theory and research that heterosexuality is 'natural' and normal. Queer theory is often identified, especially in its early stages of development, with writers associated with literary criticism and cultural studies, and generally denotes a postmodernist approach to understanding categories of gender and sexuality. The work of Eve Sedgwick (1990), Judith Butler (1990), and Teresa de Lauretis (1991), for instance, might be taken as key to the development of queer theory. A principal characteristic of queer theory is that it problematizes sexual and gender categories by seeking the deconstruction of the binary divides underpinning and reinforcing them such as, for instance, woman/man; feminine/masculine; heterosexual/homosexual; essentialist/constructionist. While queer theory aims to develop existing notions of gender and sexuality, there are broader implications of such interventions. As Sedgwick (1990) and others have argued, the main point is the critique of existing theory for its heterosexist bias rather than simply the production of theory about those whose sexualities are marginalized such as, for example, lesbians and gay men. Sexuality is the primary focus for analysis within queer theory and, while acknowledging the importance of gender, the suggestion is that sexuality and gender can be separated analytically. In particular, queer theorists are centrally concerned with the homo/heterosexual binary and the ways in which this operates as a fundamental organizing principle in modern societies. The emphasis is on the centrality of homosexuality to heterosexual culture and the ways in which the hetero/homo binary serves to define heterosexuality at the center, with homosexuality positioned as the marginalized 'other.' Feminist perspectives, on the other hand, have tended to privilege gender in their analyses—the woman/man binary—and, as I have already outlined above, are principally concerned with sexuality insofar as it is seen as constitutive of, as well as determined by, gendered power relations. Of particular significance for the development of our understanding of the relationship between queer and feminism is a rethinking of the distinction between sexuality and gender. The relationship between sexuality and gender has, then, been theorized in different ways by different writers. These can be grouped into five broad categories. First, some theories place greater emphasis on gender insofar as concepts of sexuality are understood to be largely founded upon notions of gender (Gagnon and Simon 1973, Jackson 1996). For example, it is impossible to talk meaningfully about heterosexuality or homosexuality without first having a notion of one's sexual desires and relationships as being directed to a particular category of gendered persons. Others propose a different relationship, in which
sexuality is understood to be constitutive of gender. The radical feminist Catharine MacKinnon (1982), for example, suggests that it is through the experience of 'sexuality,' as it is currently constructed, that women learn about gender, learn what 'being a woman' means: 'Specifically, "woman" is defined by what male desire requires for arousal and satisfaction and is socially tautologous with "female sexuality" and the "female sex"' (MacKinnon 1982). A third way of understanding the relationship between sexuality and gender, which moves away from causal relationships where one is seen as more or less determining of the other, is one which allows for the possibility that the two categories can be considered as analytically separate, if related, domains. Gayle Rubin (1984), in her account of what she terms a 'sex/gender system,' and others who have been influenced by her work, such as Eve Sedgwick (1990), make this distinction between sexuality and gender, which means that it is possible for sexuality to be theorized apart from the framework of gender difference. This is a model favored by many queer theorists. Alternatively, we may reject all of these approaches in favor of developing a fourth model which relies on the notion that sexuality and gender are inherently co-dependent and may not usefully be distinguished one from the other (Wilton 1996). Our contemporary understandings of sexuality and gender are such that there can be no simple, causal model that will suffice to explain the interconnections between them. However, rather than wanting to privilege one over the other, or seeking to analytically distinguish sexuality and gender or, alternatively, to collapse the two, we might propose a fifth approach which investigates 'their complex interimplication' (Butler 1997). It is this articulation of new ways of thinking about sexuality and gender in a dynamic, historically, and socially specific relationship that is one of the main tasks facing both feminist and queer theory (Richardson 2000).

See also: Female Genital Mutilation; Feminist Theory: Psychoanalytic; Feminist Theory: Radical Lesbian; Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Gender and Reproductive Health; Gender Differences in Personality and Social Behavior; Gender Ideology: Cross-cultural Aspects; Heterosexism and Homophobia; Lesbians: Historical Perspectives; Male Dominance; Masculinities and Femininities; Prostitution; Queer Theory; Rape and Sexual Coercion; Rationality and Feminist Thought; Regulation: Sexual Behavior; Sex Offenders, Clinical Psychology of; Sex-role Development and Education; Sex Therapy, Clinical Psychology of; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Harassment: Legal Perspectives; Sexual Harassment: Social and Psychological Issues; Sexual Orientation and the Law; Sexual Orientation:
Biological Influences; Sexual Orientation: Historical and Social Construction; Sexual Risk Behaviors; Sexuality and Geography; Teen Sexuality; Transsexuality, Transvestism, and Transgender
Bibliography

Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Butler J 1997 Critically queer. In: Phelan S (ed.) Playing with Fire: Queer Politics, Queer Theories. Routledge, London
de Beauvoir S 1964 The Second Sex. Bantam Books, New York
de Lauretis T 1991 Queer theory: Lesbian and gay sexualities: An introduction. Differences: A Journal of Feminist Cultural Studies 3(2): iii–xviii
Delphy C 1993 Rethinking sex and gender. Women's Studies International Forum 16(1): 1–9
Foucault M 1979 The History of Sexuality, Vol. 1. Allen Lane, London
Gagnon J H, Simon W (eds.) 1967 Sexual Deviance. Harper & Row, London
Gagnon J H, Simon W 1973 Sexual Conduct: The Social Sources of Human Sexuality. Harper & Row, London
Jackson S 1996 Heterosexuality and feminist theory. In: Richardson D (ed.) Theorising Heterosexuality. Open University Press, Buckingham, UK
MacKinnon C A 1982 Feminism, Marxism, method and the state: An agenda for theory. Signs: Journal of Women in Culture and Society 7(3): 515–44
McIntosh M 1968 The homosexual role. Social Problems 16(2): 182–92
Nicholson L 1994 Interpreting gender. Signs: Journal of Women in Culture and Society 20(1): 79–105
Plummer K 1975 Sexual Stigma: An Interactionist Account. Routledge and Kegan Paul, London
Rich A 1980 Compulsory heterosexuality and lesbian existence. Signs: Journal of Women in Culture and Society 5(4): 631–60
Richardson D 1997 Sexuality and feminism. In: Robinson V, Richardson D (eds.) Introducing Women's Studies: Feminist Theory and Practice, 2nd edn. Macmillan, Basingstoke, UK
Richardson D 2000 Rethinking Sexuality. Sage, London
Rubin G 1984 Thinking sex: Notes for a radical theory of the politics of sexuality. In: Vance C S (ed.) Pleasure and Danger: Exploring Female Sexuality. Routledge, London
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Wilton T 1996 Which one's the man? The heterosexualisation of lesbian sex. In: Richardson D (ed.) Theorising Heterosexuality. Open University Press, Buckingham, UK
D. Richardson
Sexuality and Geography

Geographical questions concerning sexuality are ideally investigated by combining empirical work with theories of how and why particular sexualities garner spatial force and cultural meaning. In practice, most
theoretical treatments of sexuality in geography have dealt with normative forms of heterosexuality in industrial contexts. Queries about heterosexuality have centered partially on theorizing why and how procreation is given symbolic, spatial, and practical centrality in different cultural and historical contexts. Other theoretical queries have involved analyzing how heterosexual metaphors and expectations inform the structure and meaning of language, discourse, and place. For instance, what is the historical and spatial significance of saying master bedroom, cockpit, capitalist penetration, motherland, virgin territory, mother-earth, or father-sky? In contrast, empirical work has centered on documenting where nonheterosexualities take place, how states and societies regulate nonheterosexual places, bodies, and practices, and the sociospatial difficulties and resistances faced by those who desire nonheterosexual places and identities. Some empirical work also addresses the marginalized places of sex workers in heterosexual sex-work trades. The disparity between theoretical and empirical work reflects how knowledge and power intersect spatially. Most persons today are compelled to assume, and are hence familiar with, the bodily and spatial arrangements of heterosexual gestures and public displays of affection, procreational practices (marriages, buying a 'family' home, having children), and forms of entertainment. Consequently, little explanatory or descriptive analytical space is needed. In contrast, the bodily and spatial arrangements of those assuming non-normative sexualities are commonly hidden from view and relatively unknown. They therefore require documentation before theorization can proceed. Intersections between geography and sexuality cannot be understood solely by listing research accomplished to date. Rather, discussion of how geographers have formally and informally navigated sexuality debates is needed to contextualize how external politics have informed internal concerns and research agendas.
1. The Sociology of Sexuality Research in Geography

Few geographers have, until recently, explored how sexuality shapes and is shaped by the social and spatial organization (what geographers call the sociospatiality) of everyday life. Fewer still have explored how geography as a discipline is shaped by heterosexual norms and expectations, or what is called heteronormativity. The ways that modern heterosexuality informs everyday life are not seen or studied because it is taken as natural. Hence, most people unconsciously use it to structure the sociospatiality and meaning of their lives, living out and within what Butler (1990) calls a 'heterosexual matrix.' Many cultural settings in which heterosexuality predominates place great
practical and ideological value on procreation. Here, couplings of persons with genitalia qualified as oppositely sexed are presumed to be biologically natural, feeding into culturally constructed ideals of gender identities and relations (WGSG 1997). In modern industrial contexts, boys and girls are expected to grow up to be fathers and mothers who settle into a nuclear household to procreate. Given modern heterosexuality's centrality in determining what is natural, it commonly grounds notions of ontology (being), epistemology (ways of knowing), and truth. The socially constructed 'naturalness' of heterosexuality in modern contexts is produced and reproduced at a variety of geographical scales. Many nation-states, for example, define and regulate heterosexuality as the moral basis of state law and moral order (e.g., The Family), forbidding sexual alliances not ostensibly geared towards procreation (Nast 1998). Such laws and related political, cultural, and epistemological strictures have historically limited the kinds of spatial questions geographers have thought, imagined, or asked. Today, precisely when the existence and sovereignty of nation-states is being challenged by transnational actors, so, too, is the relevance of the nuclear family being questioned, suggesting a partial societal and political symbiosis between family and state types. Nevertheless, for the time being, heterosexual strictures predominate across sociospatial domains, helping to account for geography's generalized disciplinary anxiety towards: (a) those who conduct research on, or question, how heterosexuality is sociospatially made the norm, (b) geographers who hold queer identities, and (c) those engaged in queer identity research. The word queer, here, refers to persons practicing non-normative kinds of sexuality. The word is not meant to efface differences among sexualities, but to stress the oppositional contexts through which those who are non-normative are made marginal (see Elder et al. in press). By the late 1980s geographers in significant numbers had brought analytical attention to the violence involved in sustaining what Rich (1980) has named 'compulsory heterosexuality' (e.g., Foord and Gregson 1986, Bell 1991, Valentine 1993a, 1993b, WGSG 1997, Nast 1998). Part of this violence has to do with the fact that most geographic research fails to question how heterosexuality is spatially channeled to shape the discipline and everyday life. Heterosexuality's incorporation into the micropractices and spaces of the discipline means that heteronormativity dominates the ways geographers investigate space. Witness the sustained heterosexist practices and spaces of geography departments and conferences (e.g., Chouinard and Grant 1995, Valentine 1998), heterosexist geographical framings or epistemologies of space, nature, and landscape, and the discipline's heterosexist empirical interests, divisions, and foci (e.g., Rose 1993, Binnie 1997a). Even
feminist geographical analyses rarely show how constructions of gender relate to the heterosexual practices to which they are attached (Foord and Gregson 1986, Townsend in WGSG 1997, p. 54). In these ways, heterosexuality underpins geographical notions of self, place, sex, and gender (Knopp 1994). Consequently, geographers reproduce hegemonic versions of hetero sex and oedipal (or nuclear) family life, helping to make these appear legitimate, natural, morally right, and innocent.
2. Theoretical Research on Heterosexuality and Geography

Perhaps the first geographers to theorize heterosexuality's impact were Foord and Gregson (1986). Using a realist theoretical framework, they argued that gender divisions of male and female derive from procreation and, hence, heterosexuality. Later, Rose (1993) demonstrated how the language, interests, methods, theories, and practices of certain subfields in geography (particularly time geography and humanistic geography) belie a masculinity that, respectively, ignores women's lives or renders them nostalgic objects of desire, not unlike how landscapes and nature have been romanticized and feminized in the discipline. Yet the masculinity Rose describes is heterosexual; it is heteromasculinity. Heteromasculinity can additionally be argued to reside in the epistemologies and empirical foci of geographic research generally (Knopp 1994), especially physical geography and some Marxian analyses of space. Nast (1998) theorizes that modern heteromasculinity is not a singular entity set in structural opposition to femininity, as in traditional theories of gendered binaries. Rather, the heteromasculine is structured along two imagined, symbolized, and enacted avenues: (a) the filial (son-like) and (b) the paternal. The filial is constructed as hyperembodied and hence celebrates-as-it-constructs men's superior physical strength and courage. The paternal is different in that it celebrates men's superior intellectual strength, objectivity, and cunning and is written about as though disembodied. These two masculinities are antagonistically inter-related in an imaginary and symbolic framework supportive of the prototypical industrial, oedipal family. In the context of globalization and nationalisms, Gibson-Graham (1999) and Nast (1998), respectively, theorize how heterosexualities and geographies discursively and practically make one another. Gibson-Graham analyzes the striking rhetorical parallels between some Marxian theoretical constructions of capital and some social constructions of rape that cast women as victims. Thus, many Marxian theories script capitalism as violently irrepressible, a force that destroys noncapitalist social relations with which it
comes into contact. Such theorization gathers force from heterosexualized language and imagery, such that capitalism and its effects are rendered naturally and forever singular, hard, rapacious, and spatially penetrative. Analogously, raped women have commonly been scripted as victims of men's naturally and singularly superior strength and virility. Rape, once initiated, is judged as unstoppable or inevitable. Women are therefore encouraged to submit (like noncapitalist groups) to heteromasculinity's logic-violence. For Gibson-Graham, what needs to be resisted is not the 'reality' that women and markets are naturally rapable or that men and capitalism are monolithically superior and rapacious. Rather, imaginaries and discursive formations need to be deployed that allow for difference and agency and that support, facilitate, and valorize resistance and change. Nast (1998), drawing upon Foucauldian notions of discourse, similarly shows how heteronormativity is dispersed across sociospatial circuits of everyday life, including discourses and practices of the body, nation, transnation, and world. Focusing on modern nationalist language and imagery, she shows that many representations of nation-states depend upon images of a pure maternal and a pure nuclear family. In these instances, heteromasculine control over women's bodies and procreation is commonly linked to eugenics or racialized notions of national purity. Conversely, some nation-states, particularly fascist ones, are represented using phallic language and imagery that glorify the oedipal (nuclear) family father. Cast as superior protector of family and nation, he and his sons defend against what is deemed weak and polluting, including other 'races' and the maternal. Nuclear familial imagery and practices have also been used in the context of industrial interests, providing, for example, practical and aesthetic means for systematically alienating and reproducing labor. Here, the nuclear family is a core setting for producing and socializing industrial workers, with the family and family home fetishized as natural, protected places of privacy and peace.
3. Empirical Research on Sexuality and Geography in Sociospatial Context

Most sexuality research in geography consists of empirical studies of sexuality. In many cases the research is undertheorized, partially reflecting the nascent state of sexuality research, wherein the specificities of sexualities are mapped out before they are theorized across contexts. Many of those documenting the spatial contours of sexualized bodies and practices have focused on: (a) marginalized heterosexuals, such as sex workers, and (b) queer bodies and places. The former is heteronormative to the degree that the researchers do not specify that it is heterosexuality
with which they work; they also do not theorize heterosexuality’s specific sociospatial force in the contexts they study. Consequently their work reinforces popular senses that when one speaks of sex, it is naturally and singularly heterosex. In contrast, researchers who study the spatiality of queer bodies and places work against the grain of heterosexual norms and expectations because they call attention to the specific sexuality(ies) through and against which they work. Perhaps because in some postindustrial contexts the nuclear family and procreational economies went into decline in the 1970s (alongside the Fordist family), social movements involving queer sexualities obtained greater popular recognition and political and material might, especially where related to gay men. It was at this time that a first wave of queer geographical work broke surface, dealing primarily with white gay male investors in urban contexts and drawing largely upon Marxian theories of rent. Many of these works were unpublished and bore descriptive titles (but see Weightman 1980; for unpublished works, see Elder et al. in press). In tandem with increased scholarly production was an increase in the scholarly visibility of queer geographers (see Elder et al. in press). Evidence of increased networking among sexuality researchers is Mapping Desire, the first edited collection on sexuality and geography (Bell and Valentine 1995). The collection, like other works produced towards the late 1980s and 1990s, reflects broad geographical interests, ranging from heterosexual prostitution in Spain, to state surveillance and disciplinary actions towards Black men and women in apartheid South Africa, to the ‘unsafeness’ of the traditional nuclear family home for many lesbians. The collection consists mostly of case studies detailing how marginalized identities are lived out, negotiated, and contested. Ironically, given the collection’s title, no spatial theories of desire are presented. Institutional recognition of queer geographers or sexuality research was not sought in English-speaking contexts until the 1990s, and only then in the United States and Britain. In the 1980s, members of the Association of American Geographers (AAG) formed the Lesbigay Caucus, subsequently holding organizational meetings at annual AAG meetings and cosponsoring sessions. In 1992 a small group of British geographers created The Sexuality and Space Network, a research and social network affiliated loosely with the Institute of British Geographers. That same year, members of the Lesbigay Caucus decided to work for the creation of a Sexuality and Space Specialty Group within the AAG which, unlike the caucus, would sponsor sessions and research committed to exploring inter-relations between sexuality and space. The Specialty Group was granted official status in 1996, becoming the first group of its kind in the discipline.
Empirical work on sexuality and geography is gendered, especially in the context of queer research. Most scholars writing about lesbian spaces are women documenting geographies of women’s fear and discrimination and how these are resisted at different scales and in different contexts. In contrast, most researchers addressing gay male spaces are men. In contrast to lesbian-related research, that related to gay men speaks mostly to proactive market measures successfully taken in securing urban-based private property and territories, particularly queer cultural districts produced through large investments in real estate and small business. In either case, western contexts prevail. Thus, Patricia Meono-Picado (Jones et al. 1997) documents how Latina lesbians in New York used spatial tactics to oppose the homophobic programming of a Spanish-language radio station, Valentine (1993b) and McDowell (in Bell and Valentine 1995) chart how some lesbians tenuously negotiate their identities in the workplace, Rothenberg (in Bell and Valentine 1995) discusses the informal networks lesbians use to create community and safety, and Johnston (1998) speaks of the importance of gyms as safe public spaces for alternative female embodiments. Valentine also discusses discriminations faced by lesbians every day (1993a) and the considerable diversity within and among lesbian communities (in Jones et al. 1997). Finally, Chouinard and Grant (1995) argue against the marginalization of disabled and lesbian women in the academy, comparing their exclusions as disabled and lesbian women, respectively. In contrast, the urban market emphasis present in much gay male research is partly evident in titles such as ‘The Broadway Corridor: Gay Businesses as Agents of Revitalization in Long Beach, California’ (Ketteringham 1983, cited in Elder et al. in press). The title secondarily speaks to a phenomenon present in much research, particularly about gay male spaces, namely, the exclusionary use of the word ‘gay.’ While a study may pertain to gay men only, the word ‘gay,’ like heteropatriarchal designations of ‘man,’ is commonly deployed as though it speaks for all queer communities. It thereby obliterates the specificities and identities of those not gay-male-identified (Chouinard and Grant 1995). Perhaps the most prominent and prolific scholar pioneering research into urban markets and queer life is Larry Knopp. Though his work centers on gay male gentrification, he has consistently attempted to theorize gendered spatial differences in queer opportunity structures and life (e.g., Knopp 1990, Lauria and Knopp 1985). Gay men’s desires and differential abilities to procure spatial security at scales and densities greater than those obtained by lesbians (for example, Boystown in Chicago, the Castro district in San Francisco, the West End in Vancouver, Mykonos and Corfu in Greece, and cruising areas for western gay men in Bangkok) have been a source of scholarly debate.
Castells (1983) argued early on, for example, that urban-based gay men form distinct neighborhoods because, as men, they are inherently more territorial. Adler and Brennan (1992) disagree, contending that lesbians, as women, are relatively disadvantaged economically and more prone to violence and attack. Consequently, they are less able to secure territory, legitimacy, and visibility (see also Knopp 1990). More recent analyses have taken a comparative or cross-cultural look at queer life, entertaining both greater analytical specificity and diversity along lines of gender, ‘race,’ disability, and\or national or rural–urban location (e.g., Brown 2000, Bell and Valentine 1995, Chouinard and Grant 1995). There is also nascent research on state regulation of sexuality, national identity, and citizenship (e.g., Binnie 1997b, Nast 1998) and on intersections of sexuality and the academy (JGHE 1999). Despite the increased presence of queer research and researchers in the discipline, sexuality research, especially that which deconstructs heterosexuality or which makes queer sexuality visible, is still considered radical (Chouinard and Grant 1995, Valentine 1998, JGHE 1999). Negative reactions point to anxieties over research that exposes heterosexuality’s artifice, research that implicitly argues against the privileges and privileging power relations upon which heterosexual ontologies of truth and gender are grounded. The depth of fear is made poignantly clear in a recent article by Valentine (1998) about the ongoing homophobic harassment levied at her by someone in the discipline who collapses her sexuality (she is a lesbian) with her sexuality research on lesbians, explicitly naming both a moral abomination.
4. Future Research A number of areas for future research suggest themselves. First, theorization is needed in addressing sexuality–geography links. What kinds of spatial and symbolic work do normative and queer sexualities practically accomplish across historical and cultural place? More work is needed that explores how normative heterosexualities vary across time and place, in keeping with diverse sociocultural expressions, uses, and values of kinship and family structures. Moreover, the nuclear family is apparently in decline unevenly across place and time, in keeping with shifts in places of industrialization, raising the question of whether some sexualities and affective familial patterns are better suited to certain political economies than to others (see also Knopp 1994). If this is the case, what kinds of sexual identities and desires work well in postindustrial places, and how are industrial restructuring processes affecting oedipal family life and the commoditization of normative and oppositionally sexed identities? What sorts of alternative sociospatial alliances are needed to allow diverse sexualities to
coexist? Are gay white men between the ages of 25 and 40 ideally situated economically and culturally to take advantage of the decline of the nuclear family in postindustrial contexts, which do not depend as much upon, or value, procreation? Second, sexuality–geography research needs to be more theoretically and empirically attuned to differences produced through normative and oppositional constructions of race, gender, class, disability, age, religious beliefs, and nation. Little gender research has been done in the context of queer communities, for example. Are sociospatial interactions between gay male and lesbian communities devoid of gendered tensions? Does patriarchy disappear in spaces of gay male communities (Chouinard and Grant 1995, Knopp 1990)? Are lesbians or gay men situated outside the signifying strictures of heteronormative codes of femininity and masculinity? And how are constructions of race expressed sexually and geographically? Much work needs to be done on how persons of color have been put into infantilized, subordinate positions and places associated with the white-led Family. African-American men after emancipation, for example, were consistently constructed as rapist-sons sexually desirous of the white mother. What kind of work do these and other similarly sexed constructions accomplish in colonial and neocolonial contexts (Nast 2000)? Similarly, why does sex with young boys and men of color, particularly in Southeast Asia, have such wide market appeal among privileged gay white men, resulting in considerable sexualized tourism investment? And what kind of cultural, political, and economic work is achieved through sado-masochistic constructions threaded across sexed and raced communities? What precisely is being constructed and why? Finally, how do sociospatial debates about, and constructions of, sexuality continue to shape the theories, practices, and empirical concerns of geography? See also: Gender and Environment; Gender and Place; Queer Theory; Sexuality and Gender
Bibliography Adler S, Brennan J 1992 Gender and space: Lesbians and gay men in the city. International Journal of Urban and Regional Research 16: 24–34 Bell D J 1991 Insignificant others: Lesbian and gay geographies. Area 23: 323–9 Bell D, Valentine G (eds.) 1995 Mapping Desire. Routledge, New York Binnie J 1997a Coming out of geography: Towards a queer epistemology? Environment and Planning D: Society and Space 15: 223–37 Binnie J 1997b Invisible Europeans: Sexual citizenship in the New Europe. Environment and Planning A 29: 237–48 Brown M P 2000 Closet Geographies. Routledge, New York Butler J P 1990 Gender Trouble. Routledge, New York
Castells M 1983 The City and the Grassroots. University of California Press, Berkeley, CA Chouinard V, Grant A 1995 On being not even anywhere near ‘the project’: Ways of putting ourselves in the picture. Antipode 27: 137–66 Elder G, Knopp L, Nast H in press Geography and sexuality. In: Gaile G, Willmott C (eds.) Geography in America at the Dawn of the 21st Century. Oxford University Press, Oxford, UK Foord J, Gregson N 1986 Patriarchy: towards a reconceptualisation. Antipode 18: 186–211 Gibson-Graham J K 1998 Queerying globalization. In: Nast H, Pile S (eds.) Places Through the Body. Routledge, New York Johnston L 1998 Reading the sexed bodies and spaces of gyms. In: Nast H J, Pile S (eds.) Places Through the Body. Routledge, New York Jones J P III, Nast H J, Roberts S H (eds.) 1997 Thresholds in Feminist Geography. Rowman and Littlefield, Lanham, MD JGHE Symposium: Teaching Sexualities in Geography 1999 The Journal of Geography in Higher Education 23: 77–123 Knopp L 1990 Some theoretical implications of gay involvement in an urban land market. Political Geography Quarterly 9: 337–52 Knopp L 1994 Social justice, sexuality, and the city. Urban Geography 15: 644–60 Lauria M, Knopp L 1985 Towards an analysis of the role of gay communities in the urban renaissance. Urban Geography 6: 152–69 Nast H 1998 Unsexy geographies. Gender, Place and Culture 5: 191–206 Nast H J 2000 Mapping the ‘unconscious’: Racism and the oedipal family. Annals of the Association of American Geographers 90(2): 215–55 Rich A 1980 Compulsory heterosexuality and lesbian existence. Signs 5: 631–60 Rose G 1993 Feminism and Geography. University of Minnesota Press, Minneapolis, MN Valentine G 1993a (Hetero)sexing space: Lesbian perceptions and experiences of everyday spaces. Environment and Planning D: Society and Space 11: 395–413 Valentine G 1993b Negotiating and managing multiple sexual identities: Lesbian time-space strategies. Transactions of the Institute of British Geographers 18: 237–48 Valentine G 1998 ‘Sticks and Stones May Break my Bones’: A personal geography of harassment. Antipode 30: 305–32 Weightman B 1980 Gay bars as private places. Landscape 24: 9–17 Women and Geography Study Group (WGSG) 1997 Feminist Geographies: Explorations in Diversity and Difference. Longman, Harlow, UK
H. J. Nast
Sexually Transmitted Diseases: Psychosocial Aspects Sexually transmitted diseases (STDs) including HIV place an enormous burden on the public’s health. In 1995, the World Health Organization (WHO) estimated that, worldwide, there were 333 million new
cases of chlamydia, gonorrhea, syphilis, and trichomoniasis in 15- to 49-year-olds. In 1999, WHO estimated that there were 5.6 million new HIV infections worldwide, and that 33.6 million people were living with AIDS. Although it is not who you are, but what you do that determines whether you will expose yourself or others to STDs including HIV, STDs disproportionately affect various demographic groupings. For example, STDs are more prevalent in urban settings, in unmarried individuals, and in young adults. They are also more prevalent among the socioeconomically disadvantaged and various ethnic groups. For example, in 1996 in the United States, the incidence of reported gonorrhea per 100,000 was 826 among black non-Hispanics, 106 among Native Americans, 69 among Hispanics, 26 among white non-Hispanics, and 18.6 among Asian\Pacific Islanders (Aral and Holmes 1999). Demographic differences in STD rates are most likely explained by differences in sexual behaviors or disease prevalence. Thus, demographic variables are often referred to as risk markers or risk indicators. In contrast, sexual and healthcare behaviors that directly influence the probability of acquiring or transmitting STDs represent true risk factors. From a psychosocial perspective, it is these behavioral risk factors (and not risk markers) that are critical for an understanding of disease transmission (see Sexual Risk Behaviors).
1. Transmission Dynamics To understand how behavior contributes to the spread of an STD, consider May and Anderson’s (1987) model of transmission dynamics: Ro = βcD, where Ro is the reproductive rate of infection (when Ro is greater than 1, the epidemic is growing; when Ro is less than 1, the epidemic is dying out; and when Ro = 1, the epidemic is in a state of equilibrium), β is a measure of infectivity or transmissibility, c is a measure of the rate of interaction between susceptible and infected individuals, and D is a measure of the duration of infectiousness. Each of the parameters on the right-hand side of the equation can be influenced by behavior. For example, the transmission rate (β) can be lowered by increasing consistent and correct condom use or by delaying the onset of sexual activity. Transmissibility can also be reduced by vaccines, but people must utilize these vaccines and, before that, it is necessary for people to have participated in vaccine trials. Decreasing the rate of new partner acquisition will reduce the sexual interaction rate c and, at least for bacterial STDs, the duration of infectiousness D can be reduced through the detection of asymptomatic STDs or through early treatment of symptomatic STDs. Thus, increasing care-seeking behavior and\or increasing the likelihood that one will participate in screening programs can affect the reproductive rate.
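The multiplicative structure of the model is easy to see in a short numerical sketch. The parameter values below are hypothetical, chosen only for illustration; they are not drawn from May and Anderson (1987):

# Minimal sketch of the May and Anderson (1987) model Ro = beta * c * D.
# All parameter values are hypothetical and chosen only for illustration.

def reproductive_rate(beta, c, D):
    """Ro = transmissibility x interaction rate x duration of infectiousness."""
    return beta * c * D

# Baseline: Ro > 1, so the epidemic grows.
baseline = reproductive_rate(beta=0.3, c=2.0, D=2.0)        # 1.2

# Halving transmissibility (e.g., via consistent and correct condom use)
# pushes Ro below 1, so the epidemic dies out.
reduced_beta = reproductive_rate(beta=0.15, c=2.0, D=2.0)   # 0.6

# Halving the duration of infectiousness (e.g., via earlier detection and
# treatment of bacterial STDs) has the same proportional effect.
reduced_D = reproductive_rate(beta=0.3, c=2.0, D=1.0)       # 0.6

print(baseline, reduced_beta, reduced_D)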
Given that STDs are important cofactors in the transmission of HIV, early detection and treatment of STDs will also influence HIV transmissibility (β). Finally, D can also be influenced by patient compliance with medical treatment, as well as compliance with partner notification. Clearly, there are a number of different behaviors that, if changed or reinforced, could have an impact on the reproductive rate Ro of HIV and other STDs. A critical question is whether it is necessary to consider each of these behaviors as a unique entity, or whether there are some more general principles that can guide our understanding of any behavior. Fortunately, even though every behavior is unique, there are only a limited number of variables that need to be considered in attempting to predict, understand, or change any given behavior.
2. Psychosocial Determinants of Intention and Behavior Figure 1 provides an integration of several different leading theories of behavioral prediction and behavior change (cf. Fishbein et al. 1992). Before describing this model, however, it is worth noting that theoretical models such as the one presented in Fig. 1 have often been criticized as ‘Western’ or ‘US’ models that don’t apply to other cultures or countries. When properly applied, however, these models are culturally specific, and they require one to understand the behavior from the perspective of the population being considered. Each of the variables in the model can be found in
almost any culture or population. In fact, the theoretical variables contained in the model have been assessed in over 50 countries in both the developed and the developing world. Moreover, the relative importance of each of the variables in the model is expected to vary as a function of both the behavior and the population under consideration (see Health Behavior: Psychosocial Theories).
2.1 Determinants of Behavior Looking at Fig. 1, it can be seen that any given behavior is most likely to occur if one has a strong intention to perform the behavior, if one has the necessary skills and abilities required to perform the behavior, and if there are no environmental constraints preventing behavioral performance. Indeed, if one has made a strong commitment (or formed a strong intention), has the necessary skills and abilities, and if there are no environmental constraints to prevent behavioral performance, the probability is close to one that the behavior will be performed (Fishbein et al. 1992). Clearly, very different types of interventions will be necessary if one has formed an intention but is unable to act upon it, than if one has little or no intention to perform the behavior in question. Thus, in some populations or cultures, the behavior may not be performed because people have not yet formed intentions to perform the behavior, while in others, the problem may be a lack of skills and\or the presence of environmental constraints.
Figure 1 An integrative model
In still other cultures, more than one of these factors may be relevant. For example, among female commercial sex workers (CSWs) in Seattle, Washington, only 30 percent intend to use condoms for vaginal sex with their main partners, and of those, only 40 percent have acted on their intentions (von Haeften et al. 2000). Clearly, if people have formed the desired intention but are not acting on it, a successful intervention will be directed either at skills building or will involve social engineering to remove (or to help people overcome) environmental constraints.
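This intervention-selection logic can be sketched in a few lines of code. The sketch below is an illustrative simplification of the model’s final stage, not an implementation of the full integrative model in Fig. 1:

# Sketch of the decision logic described above: very different interventions
# follow depending on which precondition of behavior is missing.

def intervention_target(strong_intention, has_skills, constraints_present):
    # Behavior is most likely when all three conditions hold (Fishbein et al. 1992).
    if not strong_intention:
        return 'build intention (via attitudes, perceived norms, or self-efficacy)'
    if not has_skills:
        return 'skills building'
    if constraints_present:
        return 'social engineering to remove or overcome environmental constraints'
    return 'behavior is likely to be performed; reinforce it'

# Example: intention formed but not acted upon, as with the Seattle CSWs above.
print(intervention_target(strong_intention=True, has_skills=False,
                          constraints_present=True))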
2.2 Determinants of Intentions On the other hand, if strong intentions to perform the behavior in question have not been formed, the model suggests that there are three primary determinants of intention: the attitude toward performing the behavior (i.e., the person’s overall feelings of favorableness or unfavorableness toward performing the behavior), perceived norms concerning performance of the behavior (including both perceptions of what others think one should do as well as perceptions of what others are doing), and one’s self-efficacy with respect to performing the behavior (i.e., one’s belief that one can perform the behavior even under a number of difficult circumstances) (see Self-efficacy and Health). As indicated above, the relative importance of these three psychosocial variables as determinants of intention will depend upon both the behavior and the population being considered. Thus, for example, one behavior may be determined primarily by attitudinal considerations while another may be influenced primarily by feelings of self-efficacy. Similarly, a behavior that is driven attitudinally in one population or culture may be driven normatively in another. Thus, before developing interventions to change intentions, it is important first to determine the degree to which that intention is under attitudinal, normative, or self-efficacy control in the population in question. Once again, it should be clear that very different interventions are needed for attitudinally controlled behaviors than for behaviors that are under normative influence or are related strongly to feelings of self-efficacy. Clearly, one size does not fit all, and interventions that are successful in one culture or population may be a complete failure in another.
2.3 Determinants of Attitudes, Norms, and Self-efficacy The model in Fig. 1 also recognizes that attitudes, perceived norms, and self-efficacy are all, themselves, functions of underlying beliefs—about the outcomes of performing the behavior in question, about the normative proscriptions and\or behaviors of specific
referents, and about specific barriers to behavioral performance. Thus, for example, the more one believes that performing the behavior in question will lead to ‘good’ outcomes and prevent ‘bad’ outcomes, the more favorable one’s attitude toward performing the behavior. Similarly, the more one believes that specific others think one should (or should not) perform the behavior in question, and the more one is motivated to comply with those specific others, the more social pressure one will feel (or the stronger the norm) with respect to performing (or not performing) the behavior. Finally, the more one perceives that one can (i.e., has the necessary skills and abilities to) perform the behavior, even in the face of specific barriers or obstacles, the stronger will be one’s self-efficacy with respect to performing the behavior. It is at this level that the substantive uniqueness of each behavior comes into play. For example, the barriers to using and\or the outcomes (or consequences) of using a condom for vaginal sex with one’s spouse or main partner may be very different from those associated with using a condom for vaginal sex with a commercial sex worker or an occasional partner. Yet it is these specific beliefs that must be addressed in an intervention if one wishes to change intentions and behavior. Although an investigator can sit in their office and develop measures of attitudes, perceived norms, and self-efficacy, they cannot tell you what a given population (or a given person) believes about performing a given behavior. Thus, one must go to members of that population to identify salient behavioral, normative, and efficacy beliefs. One must understand the behavior from the perspective of the population one is considering.
2.4 The Role of ‘External’ Variables Finally, Fig. 1 also shows the role played by more traditional demographic, personality, attitudinal, and other individual difference variables (such as perceived risk). According to the model, these types of variables primarily play an indirect role in influencing behavior. For example, while men and women may hold different beliefs about performing some behaviors, they may hold very similar beliefs with respect to others. Similarly rich and poor, old and young, those from developing and developed countries, those who do and do not perceive they are at risk for a given illness, those with favorable and unfavorable attitudes toward family planning, etc., may hold different attitudinal, normative or self-efficacy beliefs with respect to one behavior but may hold similar beliefs with respect to another. Thus, there is no necessary relation between these ‘external’ or ‘background’ variables and any given behavior. Nevertheless, external variables such as cultural and personality differences should be reflected in the underlying belief structure.
3. The Role of Theory in HIV\STD Prevention Models, like the one represented by Fig. 1, have served as the theoretical underpinning for a number of successful behavioral interventions in the HIV\STD arena (cf. Kalichman et al. 1996). For example, the US Centers for Disease Control and Prevention (CDC) have supported two large multisite studies based on the model in Fig. 1. The first, the AIDS Community Demonstration Projects (Fishbein et al. 1999), attempted to reach members of populations at risk for HIV\STD that were unlikely to come into contact with the health department. The second, Project RESPECT, was a multisite randomized controlled trial designed to evaluate the effectiveness of HIV\STD counseling and testing (Kamb et al. 1998). Project RESPECT asked whether prevention counseling or enhanced prevention counseling was more effective in increasing condom use and reducing incident STDs than standard education. Although based on the same theoretical model, these two interventions were logistically very different. In one, the intervention was delivered ‘in the street’ by volunteer networks recruited from the community. In the other, the intervention was delivered one-on-one by trained counselors in an STD clinic. Thus, one involved community participation and mobilization while the other involved working within established public health settings. In addition, one was evaluated using the community as the unit of analysis while the other looked for behavior change at the individual level. Despite these logistic differences, both interventions produced highly significant behavioral change. In addition, in the clinic setting (where it was feasible to obtain biologic outcome measures), the intervention also produced a significant reduction in incident STDs. The success of these two interventions is due largely to their reliance on established behavioral principles. More important, it appears that theory-based approaches that are tailored to specific populations and behaviors can be effective in changing STD\HIV-related behaviors in different cultures and communities (cf. NIH 1997).
4. The Relation between Behavioral and Biological Outcome Measures Unfortunately, behavior change (and in particular, self-reported behavior change) is often viewed as insufficient evidence for disease prevention, and several investigators have questioned the validity of self-reports of behavior and their utility as outcome measures in HIV\STD prevention research. For example, Zenilman et al. (1995) investigated the relationship between STD clinic patients’ self-reported condom use and STD incidence. Patients coming to an STD clinic in Baltimore, Maryland, who agreed to participate in the study were examined for STDs upon
entry and approximately 3 months later. At the time of the follow-up exam, participants were asked to report the number of times they had sex in the past month and the number of times they had used a condom while having sex. Those who reported 100 percent condom use were compared with those who reported sometime use or no use of condoms with respect to incident (or new) STDs. Zenilman et al. (1995) found no significant relationship between self-reported condom use and incident STDs. Based on this finding, they questioned the veracity of the self-reports and suggested that intervention studies using self-reported condom use as the primary outcome measure were at best suspect and at worst, invalid. Such a view fails to recognize the complex relationship between behavioral and biological measures. 4.1 Transmission Dynamics Revisited Consider again the May and Anderson (1987) model of transmission dynamics: Ro = βcD. It is important to recognize that the impact on the reproductive rate of a change in any one parameter will depend upon the values of the other two parameters. Thus, for example, if one attempted to lower the reproductive rate of HIV by reducing transmission efficiency (either by reducing STDs or by increasing condom use), the impact of such a reduction would depend upon both the prevalence of the disease in the population and the sexual mixing patterns in that population. Clearly, if there is no disease in the population, decreases (or increases) in transmission efficiency can have very little to do with the spread of the disease. Similarly, a reduction in STD rates or an increase in condom use in those who are at low risk of exposure to partners with HIV will have little or no impact on the epidemic. In contrast, a reduction in STDs or an increase in condom use in those members of the population who are most likely to transmit and\or acquire HIV (that is, in the so-called core group) can, depending upon the prevalence of the disease in the population, have a big impact on the epidemic (cf. Pequegnat et al. 2000). To complicate matters further, it must also be recognized that changes in one parameter may directly or indirectly influence one of the other parameters. For example, at least some people have argued that an intervention program that successfully increased condom use could also lead to an increase in number of partners (perhaps because now one felt somewhat safer). If this were in fact the case, an increase in condom use would not necessarily lead to a decrease in the reproductive rate. In other words, the impact of a given increase (or decrease) in condom use on STD\HIV incidence will differ, depending upon the values of the other parameters in the model.
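This interdependence can be made concrete with a small numerical sketch; as in the earlier sketch, all values are hypothetical and purely illustrative:

# Sketch: the same behavioral change (here, halving beta through increased
# condom use) has very different epidemic consequences in two hypothetical
# populations that differ only in their interaction rate c.

def reproductive_rate(beta, c, D):
    return beta * c * D

# Low-mixing population: Ro was already near 1; the change ends the epidemic.
before_low, after_low = reproductive_rate(0.4, 1.5, 2.0), reproductive_rate(0.2, 1.5, 2.0)
# (1.2 -> 0.6)

# High-mixing ('core group') population: Ro falls much more in absolute terms,
# but remains above 1, so the epidemic continues to grow.
before_high, after_high = reproductive_rate(0.4, 6.0, 2.0), reproductive_rate(0.2, 6.0, 2.0)
# (4.8 -> 2.4)

print(before_low, after_low, before_high, after_high)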
In addition, it’s important to recognize that condom use behaviors are very different with ‘safe’ than with ‘risky’ partners. For example, condoms are used much more frequently with ‘occasional’ or ‘new’ partners than with ‘main’ or ‘regular’ partners (see, e.g., Fishbein et al. 1999). Thus, one should not expect to find a simple linear relation (i.e., a correlation) between decreases in transmission efficiency and reductions in HIV seroconversions. Moreover, it should be recognized that many other factors may influence transmission efficiency. For example, the degree of infectivity of the donor, characteristics of the host, and the type and frequency of sexual practices all influence transmission efficiency; and variations in these factors will also influence the nature of the relationship between increased condom use and the incidence of STDs (including HIV). In addition, although correct and consistent condom use can prevent HIV, gonorrhea, syphilis, and probably chlamydia, condoms are less effective in interrupting transmission of herpes and genital warts. Although one is always better off using a condom than not using a condom, the impact of condom use is expected to vary by disease. Moreover, for many STDs, transmission from men to women is much more efficient than from women to men. For example, with one unprotected coital episode with a person with gonorrhea, there is about a 50 to 90 percent chance of transmission from male to female, but only about a 20 percent chance of transmission from female to male. It should also be noted that one can acquire an STD even if one always uses a condom. Consistent condom use is not necessarily correct condom use, and incorrect condom use and condom use errors occur with surprisingly high frequencies (Fishbein and Pequegnat 2000, Warner et al. 1998). In addition, at least some ‘new’ or incident infections may be ‘old’ STDs that initially went undetected or that did not respond to treatment. Despite these complexities, it is important to understand when, and under what circumstances, behavior change will be related to a reduction in STD incidence. Unfortunately, it is unlikely that this will occur until behaviors are assessed more precisely and new (or incident) STDs can be more accurately identified. From a behavioral science or psychosocial perspective, the two most pressing problems and the greatest challenges are to assess correct, as well as consistent, condom use, and to identify those at high or low risk for transmitting or acquiring STDs (see HIV Risk Interventions).
4.2 Assessing Correct and Consistent Condom Use Condom use is most often assessed by asking respondents how many times they have engaged in sex and then asking them to indicate how many of these times they used a condom. One will get very different answers to these questions depending upon the time frame used (lifetime, past year, past 3 months, past month, past week, last time), and the extent to which one distinguishes between the type of sex (vaginal,
anal, or oral) and type of partner (steady, occasional, paying client). Irrespective of the time frame, type of partner, or type of sex, these two numbers (i.e., number of sex acts and number of times condoms were used) can be used in at least two very different ways. In most of the literature (particularly the social psychological literature), the most common outcome measure is the percent of times the respondent reports condom use (i.e., for each subject, one divides the number of times condoms were used by the number of sex acts). Perhaps a more appropriate measure would be to subtract the number of times condoms were used from the number of sex acts. This would yield a measure of the number of unprotected sex acts. Clearly, if one is truly interested in preventing disease or pregnancy, it is the number of unprotected sex acts and not the percent of times condoms are used that should be the critical variable. Obviously, there is a difference in one’s risk of acquiring an STD if one has sex 1,000 times and uses a condom 900 times than if one has sex 10 times and uses a condom 9 times. Both of these people will have used a condom 90 percent of the time, but the former will have engaged in 100 unprotected sex acts while the latter will have engaged in only 1. By considering the number of unprotected sex acts rather than the percent of times condoms are used, it becomes clear how someone who reports always using a condom can get an STD or become pregnant. To put it simply, consistent condom use is not necessarily correct condom use, and incorrect condom use almost always equates to unprotected sex.
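The contrast between the two measures is easy to compute; the sketch below simply re-derives the worked example above:

# Two ways of scoring the same self-report data: percent condom use versus
# the number of unprotected sex acts (the worked example from the text).

def percent_condom_use(sex_acts, condom_uses):
    return 100.0 * condom_uses / sex_acts

def unprotected_acts(sex_acts, condom_uses):
    return sex_acts - condom_uses

for acts, uses in [(1000, 900), (10, 9)]:
    # Both respondents report 90 percent condom use, but they differ a
    # hundredfold in the number of unprotected (i.e., risky) sex acts.
    print(acts, percent_condom_use(acts, uses), unprotected_acts(acts, uses))
# 1000 90.0 100
# 10   90.0 1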
4.3 How Often Does Incorrect Condom Use Occur? There is a great deal of incorrect condom use. For example, Warner et al. (1998) asked 47 sexually active male college students (between 18 and 29 years of age) to report the number of times they had vaginal intercourse and the number of times they used condoms in the last month. In addition, the subjects were asked to quantify the number of times they experienced several problems (e.g., breakage, slippage) while using condoms. Altogether, the 47 men used a total of 270 condoms in the month preceding the study. Seventeen percent of the men reported that intercourse was started without a condom, but they then stopped to put one on; 12.8 percent reported breaking a condom during intercourse or withdrawal; 8.5 percent started intercourse with a condom, but then removed it and continued intercourse; and 6.4 percent reported that the condom fell off during intercourse or withdrawal. Note that all of these people could have honestly reported always using a condom, and all could have transmitted and\or acquired an STD! Similar findings were obtained from both men and women attending STD clinics in the US (Fishbein and Pequegnat 2000). For example, fully 34 percent of
women and 36 percent of men reported condom breakage during the past 12 months, with 11 percent of women and 15 percent of men reporting condom breakage in the past 3 months. Similarly, 31 percent of the women and 28 percent of the men reported that a condom fell off in the past 12 months, while 8 percent of both men and women report slippage in the past 3 months. Perhaps not surprisingly, women are significantly more likely than men to report condom leakage (17 percent vs. 9 percent). But again, the remarkable finding is the high proportion of both men and women reporting different types of condom mistakes. For example, 31 percent of the men and 36 percent of the women report starting sex without a condom and then putting one on, while 26 percent of the men and 23 percent of the women report starting sex with a condom and then taking it off and continuing intercourse. This probably reflects the difference between using a condom for family planning purposes and using one for the prevention of STDs including HIV. The practice of having some sex without a condom probably reflects incorrect beliefs about how one prevents pregnancy. Irrespective of the reason for these behaviors, the fact remains that all of these ‘errors’ could have led to the transmission and\or acquisition of an STD despite the fact that condoms had, in fact, been used. In general, and within the constraints of one’s ability to accurately recall past events, people do appear to be honest in reporting their sexual behaviors, including their condom use. Behavioral scientists must obtain better measures, not of condom use per se, but of correct and consistent condom use, or perhaps even more important, of the number of unprotected sex acts in which a person engages.
4.4 Sex With ‘Safe’ and ‘Risky’ Partners Whether one uses a condom correctly or not is essentially irrelevant as long as one is having sex with a safe (i.e., uninfected) partner. However, to prevent the acquisition (or transmission) of disease, correct and consistent condom use is essential when one has sex with a risky (i.e., infected) partner. Although it is not possible to always know whether another person is safe or risky, it seems reasonable to assume that those who have sex with either a new partner, an occasional partner, or a main partner who is having sex outside of the relationship are at higher risk than those who have sex only with a main partner who is believed to be monogamous. Consistent with this, STD clinic patients with potential high risk partners (i.e., new, occasional, or nonmonogamous main partners) were significantly more likely to acquire an STD than those with potential low risk partners (18.5 percent vs. 10.4 percent; Fishbein and Jarvis 2000).
One can also assess people’s perceptions that their partners put them at risk for acquiring HIV. For example, the STD clinic patients were also asked to indicate, on seven-place likely (7)\unlikely (1) scales, whether having unprotected sex with their main and\or occasional partners would increase their chances of acquiring HIV. Not surprisingly, those who felt it was likely that unprotected sex would increase their chances of acquiring HIV (i.e., those with scores of 5, 6, or 7) were, in fact, significantly more likely to acquire a new STD (19.4 percent) than those who perceived that their partner(s) did not put them at risk (i.e., those with scores of 1, 2, 3, or 4—STD incidence = 12.6 percent). Somewhat surprisingly, clients’ perceptions of the risk status of their partners were not highly correlated with whether they were actually with a potentially safe or dangerous partner (r = .11, p < .001). Nevertheless, both risk scores independently relate to STD acquisition. More specifically, those who were ‘hi’ on both actual and perceived risk were almost four times as likely to acquire a new STD than were those who were ‘lo’ on both actual and perceived risk. Those with mixed patterns (i.e., hi\lo or lo\hi) are intermediate in STD acquisition (Fishbein and Jarvis 2000). So, it does appear possible not only to find behavioral measures that distinguish between those who are more or less likely to acquire a new STD, but, equally important, it appears possible to identify measures that distinguish between those who are, or are not, having sex with partners who are potentially placing them at high risk for acquiring an STD. 4.5 STD Incidence as a Function of Risk and Correct and Consistent Condom Use If this combined measure of actual and perceived risk is ‘accurate’ in identifying whether or not a person is having sex with a risky (i.e., an infected) or a safe (i.e., a noninfected) partner, condom use should make no difference with low-risk partners, but correct and consistent condom use should make a major difference with high-risk partners. More specifically, correct and consistent condom use should significantly reduce STD incidence among those having sex with potential high-risk partners. Consistent with this, while condom use was unrelated to STD incidence among those with low-risk partners, correct and consistent condom use did significantly reduce STD incidence among those at high risk. Not only are these findings important for understanding the relationship between condom use and STD incidence but they provide evidence for both the validity and utility of self-reported behaviors. In addition, these data provide clear evidence that if behavioral interventions significantly increase correct and consistent condom use among people who perceive they are at risk and\or who have a new, an occasional, or a nonmonogamous main partner, there will be a significant reduction in STD incidence. On
the other hand, if, among this high-risk group, we only increase consistency of use without increasing correct use and\or if we only increase condom use among those at low risk, then we will see little or no reduction in STD (or HIV) incidence.
5. Future Directions Clearly it is time to stop using STD incidence as a ‘gold standard’ to validate behavioral self-reports and to start paying more attention to understanding the relationships between behavioral and biological outcome measures. It is also important to continue to develop theory-based, culturally sensitive interventions to change a number of STD\HIV related behaviors. While much of the focus to date has been on increasing consistent condom use, interventions are needed to increase correct condom use and to increase the likelihood that people will come in for screening and early treatment, and, for those already infected, to adhere to their medical regimens. Indeed, it does appear that we now know how to change behavior, and that under appropriate conditions, behavior change will lead to a reduction in STD incidence. While many investigators have called for ‘new’ theories of behavior change, ‘new’ theories are probably unnecessary. What is needed, however, is for investigators and interventionists to better understand and correctly utilize existing, empirically supported theories in developing and evaluating behavior change interventions. See also: Health Behavior: Psychosocial Theories; HIV Risk Interventions; Self-efficacy and Health; Sexual Risk Behaviors; Vulnerability and Perceived Susceptibility, Psychology of
Bibliography Aral S O, Holmes K K 1999 Social and behavioral determinants of the epidemiology of STDs: Industrialized and developing countries. In: Holmes K K, Sparling P F, Mardh P-A, Lemon S M, Stamm W E, Piot P, Wasserheit J N (eds.) Sexually Transmitted Diseases. McGraw-Hill, New York Fishbein M, Higgins D L, Rietmeijer C 1999 Community-level HIV intervention in five cities: Final outcome data from the CDC AIDS community demonstration projects. American Journal of Public Health 89(3): 336–45 Fishbein M, Bandura A, Triandis H C, Kanfer F H, Becker M H, Middlestadt S E 1992 Factors Influencing Behavior and Behavior Change: Final Report—Theorist’s Workshop. National Institute of Mental Health, Rockville, MD Fishbein M, Jarvis B 2000 Failure to find a behavioral surrogate for STD incidence: What does it really mean? Sexually Transmitted Diseases 27: 452–5 Fishbein M, Pequegnat W 2000 Using behavioral and biological outcome measures for evaluating AIDS prevention interventions: A commentary. Sexually Transmitted Diseases 27(2): 101–10
Kalichman S C, Carey M P, Johnson B T 1996 Prevention of sexually transmitted HIV infection: A meta-analytic review of the behavioral outcome literature. Annals of Behavioral Medicine 18(1): 6–15 Kamb M L, Fishbein M, Douglas J M, Rhodes F, Rogers J, Bolan G, Zenilman J, Hoxworth T, Malotte C K, Iatesta M, Kent C, Lentz A, Graziano S, Byers R H, Peterman T A, the Project RESPECT Study Group 1998 HIV\STD prevention counseling for high-risk behaviors: Results from a multicenter, randomized controlled trial. Journal of the American Medical Association 280(13): 1161–7 May R M, Anderson R M 1987 Transmission dynamics of HIV infection. Nature 326: 137–42 NIH Consensus Development Panel 1997 Statement from consensus development conference on interventions to prevent HIV risk behaviors. NIH Consensus Statement 15(2): 1–41 Pequegnat W, Fishbein M, Celantano D, Ehrhardt A, Garnett G, Holtgrave D, Jaccard J, Schachter J, Zenilman J 2000 NIMH\APPC workgroup on behavioral and biological outcomes in HIV\STD prevention studies: A position statement. Sexually Transmitted Diseases 27(3): 127–32 von Haeften I, Fishbein M, Kaspryzk D, Montano D 2000 Acting on one’s intentions: Variations in condom use intentions and behaviors as a function of type of partner, gender, ethnicity and risk. Psychology, Health & Medicine 5(2): 163–71 Warner L, Clay-Warner J, Boles J, Williamson J 1998 Assessing condom use practices: Implications for evaluating method and user effectiveness. Sexually Transmitted Diseases 25(6): 273–7 Zenilman J M, Weisman C S, Rompalo A M, Ellish N, Upchurch D M, Hook E W, Clinton D 1995 Condom use to prevent incident STDs: The validity of self-reported condom use. Sexually Transmitted Diseases 22(1): 15–21
M. Fishbein
Shamanism Shamanism is a tradition of part-time religious specialists who establish and maintain personalistic relations with specific spirit beings through the use of controlled and culturally scripted altered states of consciousness (ASC). Shamans employ powers derived from spirits to heal sickness, to guide the dead to their final destinations, to influence animals and forces of nature in a way that benefits their communities, to initiate assaults on enemies, and to protect their own communities from external aggression. Shamans exercise mastery over ASC and use them as a means to the culturally approved end of mediating between human, animal, and supernatural worlds. Shamans draw upon background knowledge, conveyed through myth and ritual, which renders intelligible the potentially chaotic experience of ASC. The criterion of control helps to distinguish shamanism from the use of ASC in other traditions. Shamanism has long been a subject of inquiry and controversy in diverse academic disciplines, with many hundreds of accounts of shamanic practices published
by the early 1900s. It has also been a topic of spiritual interest to the wider public in Europe and North America for the past several decades. Depending on the perspective taken, shamanism is either the most archaic and universal form of human spirituality or a culturally distinct religious complex with historical roots in Siberia and a path of diffusion into North and South America. The latter view is adopted in the following.
1. Siberian Origin and Common Features The term ‘shaman’ is drawn from seventeenth century Russian sources reporting on an eastern Siberian people, the Evenk (Tungus). In the classic work on the subject by the historian of religion Mircea Eliade, ‘shamanism’ is used to refer to a complex of beliefs and practices diffused from northern Asia to societies in central and eastern Asia, as well as through all of North and South America. Eliade’s comparative work specifies a core constellation of ideas common to shamanic traditions. The classic shamanic initiation involves the novice’s selection—frequently unwanted and resisted—by a spirit(s), a traumatic and dangerous series of ordeals, followed by a death and rebirth that sometimes involves violent dismembering and subsequent reconstitution of the fledgling shaman’s body. The archetypal shamanic cosmology is vertically tiered, with earth occupying the middle level and a cosmic tree or world mountain serving as a connecting path for shamans to travel to other cosmic planes (up or down) in pursuit of their ‘helping spirits.’ Shamans use a variety of techniques to enter the ASC in which they communicate with their spirit helpers: sensory deprivation (e.g., fasting, meditation), repetitive drumming and\or dancing, and ingestion of substances with psychoactive properties (e.g., plant hallucinogens). The last of these means is particularly important in the shamanism of Central and South America, where an impressive array of plants and animal-derived chemicals have been used for their hallucinogenic properties (e.g., peyote, datura, virola, and poison from the Bufo marinus toad). Among the most commonly reported characteristics of the shaman’s ASC are transformations into animals and flights to distant places. Shamanic animals include the jaguar (lowland South America), the wolf (North American Chukchee), and bears, reindeer, and fish (Lapp). Shamanic flights are undertaken to recapture the wandering souls of the sick, to intercede on behalf of hunters with spirits that control animal species, and to guide the dead to their final destination. The fact that shamans are often called upon to heal the sick should not lead to the conclusion that they are always concerned with the well-being of their fellows. While the Hippocratic Oath of Western medicine prohibits physicians from doing harm, shamans
frequently engage in actions meant to sicken, if not kill, their adversaries. This is an important corrective to romanticized images of the shamanic vocation.
2. Popular and Professional ‘Shamanism’ Beginning in the 1960s there developed a convergence in the professional and popular interests in shamanism. Inspired by the experimentation with hallucinogenic drugs on the part of American and European youth, attention was drawn to the physiology and psychology of altered states of consciousness. In academic circles, this led to speculations about the neurochemical foundations of hallucinatory experiences and their potential therapeutic benefits. Investigations were undertaken into the imagery of shamanic states of consciousness to determine how much could be attributed to universal, physiologically related sensory experiences. One often cited collection of essays, published in the journal Ethos (Prince 1982), sought to link the shaman’s ASC to the production of the naturally occurring opiates known as endorphins. Scholars argued that this might account for the healing effect of shamanic therapies. In the popular cultures of North America and Europe, stress came to be placed on shamanism as an avenue to self-exploration, a means by which persons from any culture could advance their quest for spiritual understandings. The legacy of this development is to be found in the forms of ‘neoshamanism’ among New Age religions, which draw eclectically from more traditional shamanic practices. Native North and South American societies have been special sources of inspiration for practitioners of neoshamanism, but not always with the willing support of the indigenous people. At the same time, and in response to the evolution of research in symbolic anthropology, anthropologists have contributed increasingly sophisticated ethnographic analyses of the metaphysical underpinnings of shamanism in specific societies. These anthropologists concentrate their attention on the complexities of shamanic cosmologies and the relationships between shamanic symbolism and crucial features of the natural world (e.g., important animal and plant species, meteorological phenomena, and geographic landmarks). An especially detailed analysis of a shamanic tradition comes from the Warao of the Orinoco Delta in Venezuela. The German-born and American-trained anthropologist Johannes Wilbert documents the mythic cosmology that supports the work of three distinct kinds of shamans, each of which is responsible for the care and feeding of specific gods. The highest ranking shaman, the Wishiratu (‘Master of Pains’ or Priest Shaman), responds when the Kanobos (Supreme Beings) send sickness to a Warao village. The Bahanarotu is also a curer, but has a special
responsibility for a fertility cult centered on an astonishingly complex supernatural location, the House of Tobacco Smoke, where four insect couples and a blind snake play a game of chance under the watchful eye of a swallow-tailed kite. The Hoarotu (Dark Shaman) has the unpleasant but essential task of feeding human flesh to the Scarlet Macaw, the God of the West. The preferred food of the other gods is tobacco smoke, which is also the vehicle by which Warao shamans enter ASC—by hyperventilating on long cigars made of a strong black leaf tobacco.
3. Research Directions 3.1 Dead Ends and Detours There have been a number of dead ends and detours in shamanism studies. An unfortunate amount of energy was spent in fruitless discussions of the mental health of shamans—whether or not their abnormal behavior justified the application of psychiatric diagnoses (e.g., schizophrenia). Most scholars now recognize that the culturally patterned nature of shamans’ behavior and the positive value placed on their social role make the application of mental illness labels inappropriate. Another distraction came when cultural evolutionists hypothesized that shamanism was an intermediary stage in the developmental sequence of religious forms, in-between magic and institutionalized religion. Attempts to treat shamanism exclusively as an archaic expression of human religiosity flounder on the perseverance and even revitalization of shamanic traditions in cultures around the world. For example, the remarkable resiliency of shamanic practices is evident in the continuity between contemporary curanderismo (curing) on Peru’s north coast and shamanic iconography dating to the pre-Hispanic Chavin culture (approx. 900–200 BC). 3.2 Positive Directions A still limited amount of research has been directed toward the important question of how effective shamanic treatments are. This work has been troubled by serious theoretical issues (e.g., what constitutes ‘efficacy’) and thorny methodological questions (e.g., is the double-blind, randomized trial possible under the unusual circumstances of shamanic rituals?). The previously cited endorphin theory has not been confirmed ethnographically. Another approach has adapted psychiatric questionnaires to before and after treatment interviews with shamanic patients. A fruitful line of investigation has focused on the relationship between shamanism and culturally constructed notions of gender. Shamans sometimes ‘bend’ gender roles (e.g., transvestitism) and the sicknesses they treat can be entangled in gender-based conflicts. Some scholars suggest that where women dominate in the shaman role there is a less adversarial model of
supernatural mediation than with male shamans. The best examples of gender-focused research come from East Asia (especially Korea), where the majority of shamans are women, and from the Chilean Mapuche, among whom female and cross-dressing male shamans have predominated since at least the sixteenth century. A final body of research worth noting focuses on the capacity of shamanic traditions to survive and thrive even under the most disruptive social and political conditions. Agents of culture change, whether they are Soviet-era ideologues in Siberia or Christian missionaries in the Amazon, have been repeatedly frustrated in their attempts to obliterate shamanic practices. In recent decades, shamans have even become central figures in the politics of ethnicity and in antidevelopment protests. Scholars have shown that shamans accomplish this survival feat not by replicating ancient traditions, but by continuously reinventing them in the light of new realities and competing symbolic structures. While this strategy may occasion debates about what constitutes an ‘authentic’ shaman, it is nevertheless the key to shamanism’s continuing success.
4. A Plea for Terminological Precision

The most serious threat to the academic study of shamanism lies in the broad application of the term to any religiously inspired trance form. To so dilute the concept as to make it applicable to !Kung trance dancers, New Age spiritualists, and Mexican Huichol peyote pilgrims is to render it meaningless. The value of a scientific term is that it groups together phenomena that are alike in significant regards, while distinguishing those that are different. The promiscuous use of the term shaman will ultimately leave it as generic as 'spiritual healer,' and just as devoid of analytic value.

See also: Alternative and Complementary Healing Practices; Healing; Religion and Health; Spirit Possession, Anthropology of
Bibliography

Atkinson J M 1992 Shamanisms today. Annual Review of Anthropology 21: 307–30
Bacigalupo A M Shamans of the Cinnamon Tree, Priestesses of the Moon: Gender and Healing among the Chilean Mapuche. Manuscript in preparation
Brown M F 1989 Dark side of the shaman. Natural History 11: 8–10
Eliade M 1972 Shamanism: Archaic Techniques of Ecstasy. Princeton University Press, Princeton, NJ
Furst P T (ed.) 1972 Flesh of the Gods: The Ritual Use of Hallucinogens. Praeger, New York
Hultkrantz A 1992 Shamanic Healing and Ritual Drama. Crossroad, New York
Joralemon D 1990 The selling of the shaman and the problem of informant legitimacy. Journal of Anthropological Research 46: 105–18
Joralemon D, Sharon D 1993 Sorcery and Shamanism: Curanderos and Clients in Northern Peru. University of Utah Press, Salt Lake City, UT
Kalweit H 1988 Dreamtime & Inner Space: The World of the Shaman, 1st edn. Shambhala, Boston
Kendall L 1985 Shamans, Housewives, and Other Restless Spirits: Women in Korean Ritual Life. University of Hawaii Press, Honolulu, HI
Kleinman A, Sung L H 1979 Why do indigenous practitioners successfully heal? Social Science & Medicine 13(1B): 7–26
Peters L G, Price-Williams D 1980 Towards an experiential analysis of shamanism. American Ethnologist 7: 398–418
Prince R 1982 Shamans and endorphins: Hypotheses for a synthesis. Ethos 10(4): 409–23
Reichel-Dolmatoff G 1975 The Shaman and the Jaguar. Temple University Press, Philadelphia, PA
Shirokogoroff S M 1935 The Psychomental Complex of the Tungus. Kegan Paul, Trench and Trubner, London
Silverman J 1967 Shamanism and acute schizophrenia. American Anthropologist 69: 21–31
Sullivan L E 1988 Icanchu's Drum: An Orientation to Meaning in South American Religions. Macmillan, New York
Wilbert J 1993 Mystic Endowment: Religious Ethnography of the Warao Indians. Harvard University Press, Cambridge, MA
D. Joralemon
Shame and the Social Bond

Many theorists have at least implied that emotions are a powerful force in social process. Although Weber didn't refer to emotions directly, his emphasis on values implies their importance, since values are emotionally charged beliefs. Especially in his later works, Durkheim proposed that collective sentiments created social solidarity through moral community. G. H. Mead proposed emotion as an important ingredient in his social psychology. For Parsons, emotion is a component of social action in his AGIL scheme (Parsons and Shils 1951). Marx implicated emotions in class tensions and in the solidarity of rebelling classes. As Collins (1990) put it, what holds a society together—the 'glue' of solidarity—and what mobilizes conflict—the energy of mobilized groups—are emotions. But even the theorists who dealt with emotions explicitly, Durkheim, Mead, and Parsons, did not develop concepts of emotion, investigate their occurrence, or collect emotion data. Their discussions of emotion, therefore, have not borne fruit. The researchers whose work is reviewed here took the step of investigating a specific emotion.
1. Seven Pioneers in the Study of Social Shame

Five of the six sociologists reviewed acted independently of each other. In the case of Elias and Sennett, their discovery of shame seems forced upon them by their data. Neither Simmel nor Cooley defines what
they mean by shame. Goffman only partially defined embarrassment. The exception is Helen Lynd, who was self-conscious about shame as a concept. Lynd's book on shame was contemporaneous with Goffman's first writings on embarrassment and realized their main point: face-work meant avoiding embarrassment and shame. Helen Lewis's empirical work on shame (1971) was strongly influenced by Lynd's book. She also was sophisticated in formulating a concept of shame, and in using systematic methods to study it. Sennett's work involved slight outside influence. He approvingly cited the Lynd book on shame in The Hidden Injuries of Class (1972), and his Authority (1980) has a chapter on shame.

1.1 Simmel: Shame and Fashion

Shame plays a significant part in only one of Simmel's essays, on fashion (1904). People want variation and change, he argued, but they also anticipate shame if they stray from the behavior and appearance of others. Fashion is the solution to this problem, since one can change along with others, avoiding isolation and therefore shame (p. 553). Simmel's idea about fashion implies conformity in thought and behavior among one group in a society, the fashionable ones, and distance from another, those who do not follow fashion; it thereby relates shame to social bonds.

There is a quality to Simmel's treatment of shame that is somewhat difficult to describe, but needs description, since it characterizes most of the other sociological treatments reviewed here. Simmel's use of shame is casual and unselfconscious. His analysis of the shame component in fashion occurs in a single long paragraph. Shame is not mentioned before or after. He doesn't conceptualize shame or define it, seeming to assume that the reader will know the meaning of the term. Similar problems are prominent in Cooley, Elias, Sennett, and Goffman. Lynd and Lewis are exceptions, since they both attempted to define shame and locate it with respect to other emotions.

1.2 Cooley: Shame and the Looking Glass Self

Cooley (1922), like Simmel, was direct in naming shame. For Cooley, shame and pride both arose from self-monitoring, the process that was at the center of his social psychology. His concept of 'the looking glass self,' which implies the social nature of the self, refers directly and exclusively to pride and shame. But he made no attempt to define either emotion. Instead he used the vernacular words as if they were self-explanatory. To give just one example of the ensuing confusion: in English and other European languages, the word pride used without qualification usually has an inflection of arrogance or hubris (pride goeth before the fall). In current usage, in order to refer to the kind
of pride implied in Cooley's analysis, the opposite of shame, one must add a qualifier like justified or genuine. Using undefined emotion words is confusing. However, Cooley's analysis of self-monitoring suggests that pride and shame are the basic social emotions. His formulation of the social basis of shame in self-monitoring can be used to amend Mead's social psychology. Perhaps the combined Mead–Cooley formulation can solve the inside–outside problem that plagues psychoanalytic and other psychological approaches to shame, as I suggest below.

1.3 Elias: Shame in the Civilizing Process

Elias undertook an ambitious historical analysis of what he calls the 'civilizing process' (1994). He traced changes in the development of personality and social norms from the fifteenth century to the present. Like Weber, he gave prominence to the development of rationality. Unlike Weber, however, he gave equal prominence to emotional change, particularly to changes in the threshold of shame: 'No less characteristic of a civilizing process than "rationalization" is the peculiar molding of the drive economy that we call "shame" and "repugnance" or "embarrassment."' Using excerpts from advice manuals, Elias outlined a theory of modernity. By examining advice concerning etiquette, especially table manners, body functions, sexuality, and anger, he suggests that a key aspect of modernity involved a veritable explosion of shame.

Elias showed that there was much less shame about manners and emotions in the early part of the period he studied than there was in the nineteenth century. In the eighteenth century, a change began occurring in advice on manners. What was said openly and directly earlier begins only to be hinted at, or left unsaid entirely. Moreover, justifications are offered less often. One is mannerly because it is the right thing to do. Any decent person will be courteous; the intimation is that bad manners are not only wrong but also unspeakable, the beginning of repression. The change that Elias documents is gradual but relentless; by a continuing succession of small decrements, etiquette books fall silent about the reliance of manners, style, and identity on respect, honor, and pride, and the avoidance of shame and embarrassment. By the end of the eighteenth century, the social basis of decorum and decency had become virtually unspeakable. Unlike Freud or anyone else, Elias documents, step by step, the sequence of events that led to the repression of emotions in modern civilization.

By the nineteenth century, Elias proposed, manners are inculcated no longer by way of adult-to-adult verbal discourse, in which justifications are offered. Socialization shifts from slow and conscious changes by adults over centuries to swift and silent indoctrination of children in their earliest years. No justification is offered to most children; courtesy has
become absolute. Moreover, any really decent person would not have to be told. In modern societies, socialization automatically inculcates and represses shame.

1.4 Richard Sennett: Is Shame the Hidden Injury of Class?

Although The Hidden Injuries of Class (1972) carries a powerful message, it is not easy to summarize. The narrative is built from quotes from interviews and the authors' brief interpretations. They do not devise a conceptual scheme or a systematic method. For this reason, readers are required to devise their own conceptual scheme, as I do here. The book is based on participant observation in communities, schools, clubs, and bars, and 150 interviews with white working-class males, mostly of Italian or Jewish background, in Boston for one year beginning in July of 1969 (pp. 40–1).

The hidden injuries that Sennett and Cobb discovered might be paraphrased as follows: first, because of their class position, these working-class men felt that they were not accorded the respect that they should have received from others, particularly from their teachers, bosses, and even their own children. That is, these men have many complaints about their status. Second, these men also felt that their class position was at least partly their own fault. Sennett and Cobb imply that social class is responsible for both injuries. They believe that their working men did not get the respect they deserved because of their social class, and that the second injury, lack of self-respect, is also the fault of class, rather than the men's own fault, as most of them thought.

Sennett and Cobb argue that in US society, respect is largely based on individual achievement, the extent to which one's accomplishments provide a unique identity that stands out from the mass of others. The role of public schools in the development of abilities forms a central part of Sennett and Cobb's argument. Their informants lacked self-respect, the authors thought, because the schooling of working-class boys did not develop their individual talents in a way that would allow them to stand out from the mass as adults. In the language of emotions, they carry a burden of feelings of rejection and inadequacy, which is to say chronic low self-esteem (shame).

From their observations of schools, Sennett and Cobb argue that teachers single out for attention and praise only a small percentage of the students, usually those who are talented or closest to middle-class. This praise and attention allows the singled-out students to develop their potential for achievement. The large majority of the boys, however, are ignored and, in subtle ways, rejected. There are a few working-class boys who achieve their potential through academic or athletic talent. But the large mass does not. For them, rather than
opening up the world, public schools close it off. Education, rather than becoming a source of growth, provides only shame and rejection. For the majority of students, surviving school means running a gauntlet of shame. These students learn by the second or third grade that it is better to be silent in class than to risk the humiliation of a wrong answer. Even students with the right answers must deal with having the wrong accent, clothing, or physical appearance. For most students, schooling is a vale of shame.

1.5 Helen Lynd: Shame and Identity

During her lifetime, Helen Lynd was a well-known sociologist. With her husband, Robert, she published the first US community studies, Middletown and Middletown in Transition. But Lynd was also profoundly interested in developing an interdisciplinary approach to social science. In her study On Shame and the Search for Identity (1961), she dealt with both the social and psychological sides of shame. She also clearly named the emotion of shame and its cognates, and located her study within previous scholarship, especially psychoanalytic studies. But Lynd also modified and extended the study of shame by developing a concept, and by integrating its social and psychological components.

In the first two chapters, Lynd introduces the concept of shame, using examples from literature to clarify each point. In the next section, she critiques mainstream approaches in psychology and the social sciences. She then introduces ideas from lesser-known approaches, showing how they might resolve some of the difficulties. Finally, she has an extended discussion of the concept of identity, suggesting that it might serve to unify the study of persons by integrating the concepts of self, ego, and social role under the larger idea of identity.

Lynd's approach to shame is much more analytical and self-conscious than that of the other sociologists reviewed here. They treated shame as a vernacular word. For them, shame sprang out of their data, unavoidable. But Lynd encounters shame deliberately, as part of her exploration of identity. Lynd explains that shame and its cognates get left out because they are deeply hidden, but at the same time pervasive. She makes this point in many ways, particularly in the way she carefully distinguishes shame from guilt.

One idea that Lynd develops is profoundly important for a social theory of shame and the bond: that sharing one's shame with another can strengthen the relationship. 'The very fact that shame is an isolating experience also means that … sharing and communicating it … can bring about particular closeness with other persons' (Lynd 1961, p. 66). In another place, Lynd went on to connect the process of risking the communication of shame with the kind of role-taking that Cooley and Mead had described: 'communicating
shame can be an experience of … entering into the mind and feelings of another person' (p. 249). Lynd's idea about the effects of communicating and not communicating shame was pivotal for Lewis's (1971) concepts of acknowledged and unacknowledged shame, and their relationship to the state of the social bond, as outlined below.

1.6 Goffman: Embarrassment and Shame in Everyday Life

Although shame goes largely unnamed in Goffman's early work, embarrassment and its avoidance form the central thread. Goffman's Everyperson is always desperately worried about her image in the eyes of others, trying to put her best foot forward to avoid shame. This work elaborates, and indeed fleshes out, Cooley's abstract idea of the way in which the looking glass self leads directly to pride or shame.

Interaction Ritual (1967) made two specific contributions to shame studies. In his study of face-work, Goffman states what may be seen as a model of 'face' as the avoidance of embarrassment, and of losing face as suffering embarrassment. This is an advance, because it offers readily observable markers for empirical studies of face. The importance of this idea is recognized, all too briefly, at the beginning of Brown and Levinson's (1987) study of politeness behavior.

Goffman's second contribution to the study of shame was made in a concise essay on the role of embarrassment in social interaction (1967). Unlike any of the other shame pioneers in sociology, he begins the essay with an attempt at definition. His definition is a definite advance, but it also foretells a limitation of the whole essay, since it is behavioral and physiological, ignoring inner experience. Framing his analysis in what he thought of as a purely sociological mode, Goffman omitted feelings and thoughts. His solution to the inside–outside problem was to ignore most of inner experience, just as Freud ignored most of outside events. However, Goffman affirms Cooley's point on the centrality of the emotions of shame and pride in normal, everyday social relationships: 'One assumes that embarrassment is a normal part of normal social life, the individual becoming uneasy not because he is personally maladjusted but rather because he is not … embarrassment is not an irrational impulse breaking through socially prescribed behavior, but part of this orderly behavior itself' (1967, pp. 109, 111).

Even Goffman's partial definition of the state of embarrassment represents an advance. One of the most serious limitations of current contributions to the sociology of emotions is the lack of definitions of the emotions under discussion. Much like Cooley, Elias, and Sennett, Kemper (1978) offers no definitions of emotions, assuming that they go without saying. Hochschild (1983) attempts to conceptualize various
emotions in an appendix, but does not go so far as to give concrete definitions of emotional states. Only in Retzinger (1991, 1995) can conceptual and operational definitions of the emotions of shame and anger be found.
2. Lewis's Discovery of Unacknowledged Shame

Helen Lewis's book on shame (1971) involved an analysis of verbatim transcripts of hundreds of psychotherapy sessions. She encountered shame because she used a systematic method for identifying emotions, the Gottschalk–Gleser method (Gottschalk et al. 1969, Gottschalk 1995), which involves the use of long lists of keywords that are correlated with specific emotions. Lewis found that anger, fear, grief, and anxiety cues showed up from time to time in some of the transcripts. She was surprised by the massive frequency of shame cues. Her most relevant findings were:

(a) Prevalence: Lewis found a high frequency of shame markers in all the sessions, far outranking markers of all other emotions combined.

(b) Lack of awareness: Lewis noted that patient and therapist almost never referred to shame or its near cognates. Even the word embarrassment was seldom used.

In analyzing the contexts in which shame markers occurred, Lewis identified a specific one: situations in which the patient seemed to feel distant from, rejected, criticized, or exposed by the therapist. However, the patients showed two different, seemingly opposite responses in the shame context. In one, the patient seemed to be suffering psychological pain, but failed to identify it as shame. Lewis called this form overt, undifferentiated shame. In a second kind of response, the patient seemed not to be in pain, revealing an emotional response only by rapid, obsessional speech on topics that seemed somewhat removed from the dialogue. Lewis called this second response bypassed shame.
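To make the mechanics of such keyword-based cue counting concrete, here is a minimal sketch in Python. The cue lists and category names are invented placeholders for illustration; they are not the published Gottschalk–Gleser scales.

import re
from collections import Counter

# Minimal sketch of keyword-based emotion-cue counting, in the spirit of the
# Gottschalk-Gleser content analysis. The cue lists are illustrative
# placeholders, not the published scales.
CUE_WORDS = {
    "shame": {"embarrassed", "foolish", "stupid", "awkward", "exposed", "ridiculous"},
    "anger": {"angry", "furious", "mad", "resentful"},
    "fear": {"afraid", "scared", "terrified"},
}

def count_emotion_cues(transcript: str) -> Counter:
    """Tokenize a transcript and count cue words for each emotion category."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter()
    for token in tokens:
        for emotion, cues in CUE_WORDS.items():
            if token in cues:
                counts[emotion] += 1
    return counts

sample = "I felt so foolish and exposed; it was awkward, and then I got angry."
print(count_emotion_cues(sample))  # Counter({'shame': 3, 'anger': 1})

The published scales are of course far more elaborate (weighted categories, clause-level coding, normalization by word count), but the basic logic of matching transcript tokens against emotion-specific cue lists is the same.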
3. Shame, Anger, and Conflict

In her transcripts, Lewis found many episodes of shame that extended over long periods of time. Since emotions are commonly understood to be brief signals (a few seconds) that alert us for action, the existence of long-lasting emotions is something of a puzzle. Lewis's solution to this puzzle may be of great interest in the social sciences, since it provides an emotional basis for longstanding hostility, withdrawal, or alienation. She argued that her subjects often seemed to have emotional reactions to their emotions, and that this loop may extend indefinitely. She called these reactions 'feeling traps.' The trap that arose most frequently in her data involved shame and anger. A patient interprets an expression by the therapist as hostile, rejecting, or critical, and responds with shame or embarrassment. However, the patient instantaneously masks the shame with anger, then is ashamed of being angry.
Apparently, each emotion in the sequence is brief, but the loop can go on forever. This proposal suggests a new source of protracted conflict and alienation, one hinted at in Simmel’s treatment of conflict. Although Lewis didn’t discuss other kinds of spirals, there is one that may be as important as the shame–anger loop. If one is ashamed of being ashamed, it is possible to enter into a shame–shame loop that leads to silence and withdrawal. Elias’s work on modesty implies this kind of loop.
4. Shame and the Social Bond

Finally, Lewis interpreted her findings in explicitly social terms. She proposed that shame arises when there is a threat to the social bond, as was the case in all of the shame episodes she discovered in the transcripts. Every person, she argued, fears social disconnection from others. Lewis's solution to the outside–inside problem parallels and advances the Darwin–Mead–Cooley definition of the social context of shame. She proposed that shame is a bodily and/or mental response to the threat of disconnection from the other. Shame, she argued, can occur in response to threats to the bond from the other, but it can also occur in response to actions in the 'inner theatre,' in the interior monologue in which we see ourselves from the point of view of others. Her reasoning fits Cooley's formulation of shame dynamics, and also Mead's (1934) more general framework: the self is a social construction, a process constructed from both external and internal social interaction, in role-playing and role-taking.
5. Shame as the Social Emotion

Drawing upon the work of these pioneers, it is possible to take further steps toward defining shame. By shame, I mean a large family of emotions that includes many cognates and variants, most notably embarrassment, humiliation, and related feelings such as shyness, that involve reactions to rejection or feelings of failure or inadequacy. What unites all these cognates is that they involve the feeling of a threat to the social bond. That is, I use a sociological definition of shame, rather than the more common psychological one (perception of a discrepancy between ideal and actual self). If one postulates that shame is generated by a threat to the bond, no matter how slight, then a wide range of cognates and variants follow: not only embarrassment, shyness, and modesty, but also feelings of rejection or failure, and heightened self-consciousness of any kind. Note that this definition usually subsumes the psychological one, since most ideals are social, rather than individual.

If, as proposed here, shame results from a threat to the bond, shame would be the most social of the basic emotions. Fear is a signal of danger to the body, anger a signal of frustration, and so on. The sources of fear
and anger, unlike shame, are not uniquely social. Grief also has a social origin, since it signals the loss of a bond. But bond loss is not a frequent event. Shame, on the other hand, following Goffman, is pervasive in virtually all social interaction, since it involves even a slight threat to the bond. As Goffman's work suggests, all human beings are extremely sensitive to the exact amount of deference they are accorded. Even slight discrepancies generate shame or embarrassment. As Darwin (1872) noted, the discrepancy can even be in the positive direction; too much deference can generate the embarrassment of heightened self-consciousness. Especially important for social control is a positive variant, a sense of shame. That is, shame figures in most social interaction because, although members may only occasionally feel shame, they are constantly anticipating it, as Goffman implied. Goffman's treatment points to the slightness of the threats to the bond that lead to anticipation of shame. For that reason, my use of the term shame is much broader than its vernacular use. In common parlance, shame is an intensely negative, crisis emotion closely connected with disgrace. But this is much too narrow if we expect shame to be generated by even the slightest threat to the bond.

6. Conclusion

The classic sociologists believed that emotions are crucially involved in the structure and change of whole societies. The authors reviewed here suggest that shame is the premier social emotion. Lynd's work, particularly, suggests how acknowledgement of shame can strengthen bonds and, by implication, how lack of acknowledgment can create alienation. Lewis's work further suggests how shame–anger loops can create perpetual hostility and alienation. Acknowledged shame could be the glue that holds relationships and societies together, and unacknowledged shame the force that divides them.

See also: Civilizational Analysis, History of; Emotion: History of the Concept; Emotions, Evolution of; Emotions, Sociology of; Identity: Social; Moral Sentiments in Society; Norms; Values, Sociology of

Bibliography

Brown P, Levinson S C 1987 Politeness: Some Universals in Language Usage. Cambridge University Press, Cambridge
Collins R 1990 Stratification, emotional energy, and the transient emotions. In: Kemper T D (ed.) Research Agendas in the Sociology of Emotions. State University of New York Press, Albany, NY, pp. 27–57
Cooley C H 1922 Human Nature and the Social Order. C. Scribner's Sons, New York
Darwin C 1872 The Expression of the Emotions in Man and Animals. John Murray, London
Elias N 1994 The Civilizing Process. Blackwell, Oxford, UK
Goffman E 1967 Interaction Ritual. Anchor, New York
Gottschalk L A 1995 Content Analysis of Verbal Behavior. Lawrence Erlbaum Associates, Hillsdale, NJ
Gottschalk L, Winget C, Gleser G 1969 Manual of Instruction for Using the Gottschalk–Gleser Content Analysis Scales. University of California Press, Berkeley, CA
Hochschild A R 1983 The Managed Heart. University of California Press, Berkeley, CA
Kemper T D 1978 A Social Interactional Theory of Emotions. Wiley, New York
Lewis H B 1971 Shame and Guilt in Neurosis. International Universities Press, New York
Lynd H M 1961 On Shame and the Search for Identity. Science Editions, New York
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago
Parsons T, Shils E 1951 Toward a General Theory of Action. Harvard University Press, Cambridge
Retzinger S M 1991 Violent Emotions. Sage, Newbury Park, CA
Retzinger S M 1995 Identifying shame and anger in discourse. American Behavioral Scientist 38: 1104–13
Sennett R 1980 Authority. Alfred Knopf, New York
Sennett R, Cobb J 1972 The Hidden Injuries of Class, 1st edn. Knopf, New York
Simmel G 1904 Fashion. International Quarterly X: 130–55 [Reprinted in the American Journal of Sociology 62: 541–59]

T. J. Scheff
Shared Belief

1. Introducing the Concept of Shared Belief

The importance of the notion of shared belief has been emphasized by philosophers, economists, sociologists, and psychologists at least since the 1960s (the earlier contributors include Schelling 1960, Scheff 1967, Lewis 1969, and Schiffer 1972). This article deals with the strongest kind of shared belief, to be called 'mutual belief.' While a belief can be shared in the trivial sense of two or more people having the same belief, mutual belief involves the kind of strong sharing that requires at least the participants' awareness (belief) of their similar belief. Mutual belief is one central kind of collective attitude, examples of others being collective intentions, wants, hopes, and fears. Understandably, collective attitudes are central explanatory notions in the social sciences, as one of the tasks of these sciences is to study collective phenomena, including various forms of collective thinking and acting (see below for illustrations).

There is some special, additional interest in the notion of shared belief as mutual belief. First, mutual beliefs serve to characterize social or intersubjective existence in a sense that does not rely on the participants' making agreements or contracts. Thus, many social relations, properties, and events arguably involve mutual beliefs (cf. Lewis 1969, Ruben 1985, Lagerspetz 1995, Tuomela 1995). As a simple example, think of the practice of two persons, A and B, shaking
hands. It presupposes that A believes that B and A are shaking hands, and that A also believes that B believes similarly; and B must believe analogously. Similarly, communication has been argued by many philosophers, especially Grice, to involve mutual belief (cf. Schiffer 1972, Grice 1989). Second, characterizations of many other collective attitudes in fact depend on the notion of mutual belief (cf. Balzer and Tuomela 1997).

In social psychology and sociology, theoreticians often speak about consensus instead of mutual belief. The notion of consensus—with the core meaning 'mutual belief'—has been regarded as relevant to such topics as public opinion, values, mass action, norms, roles, communication, socialization, and group cohesion. It can also be mentioned that fads, fashions, crazes, religious movements, and many other related phenomena have been analyzed partly in terms of shared beliefs, consensus, shared consensus, mutual belief, or some similar notions. As pointed out by the sociologist Scheff (1967), such analyses have often gone wrong because they have treated consensus merely as shared first-order belief. Thus, as Scheff argues, consensus as mere first-order agreement does not properly account for 'pluralistic ignorance' (where people agree but do not realize it) and 'false consensus' (where people mistakenly think that they agree). Scheff proposes an analysis in terms of levels of agreement corresponding to a hierarchy of so-called loop beliefs (e.g., in the case of two persons A and B, A believes that B believes that A believes something p). As is easily seen, pluralistic ignorance and false consensus are second-level phenomena. The third level will have to be brought in when speaking about people's awareness of these phenomena. Other well-known social psychological notions requiring more than shared belief are Mead's concept of 'taking the role of the generalized other,' Dewey's 'interpenetration of perspectives,' and Laing's metaperspectives (see, for example, Scheff 1967).

There are two different conceptual–logical approaches to understanding the notion of mutual (or, to use an equivalent term, common) belief: (a) the iterative account, and (b) the reflexive or fixed-point account. According to the iterative account, mutual belief is taken to mean iterated beliefs or dispositions to believe (cf. Lewis 1969, Chap. 2, and, for the weaker account in terms of dispositions to come to believe, Tuomela 1995, Chap. 1). In the two-person case, mutual belief amounts to this according to the iterative account: A and B believe that p, A believes that B believes that p (and similarly for B), A believes that B believes that A believes that p (and similarly for B); and the iteration can continue as far as the situation demands. In the case of loop beliefs there is accordingly mutual awareness only in a somewhat rudimentary sense. As will be seen, in many cases one needs only two iterations for functionally adequate mutual belief: A and B believe that p and they also believe that they believe that p. However, there are
other cases in which one may need to go higher up in the hierarchy.

The fixed-point notion of mutual belief can be stated as follows: A and B mutually believe that p if and only if they believe that p and also believe that it is mutually believed by them that p. No iteration of beliefs is, at least explicitly, involved here. Correspondingly, a clear distinction can be made between the iterative or level account and the fixed-point account (to be briefly commented on in Sect. 4).

One can speak of the individual (or personal) mode and the group mode of having an attitude such as belief or intention. This article deals with the general notion of mutual belief, be it in the individual mode or in the group mode. Some remarks on the present distinction are anyhow appropriate here. The group-mode sense, expressible by 'We, as a group, believe that p,' requires that the group in question is collectively committed to upholding its mutual belief or at least to keeping the members informed about whether it is or can be upheld. This contrasts with mutual belief in an aggregative individual mode involving only personal commitments to the belief in question. When believing in the group-mode sense, group members are accordingly committed to a certain shared view of a topic and to group-mode thoughts such as 'We, as a group, believe that p.' Group-mode beliefs are central in the analysis of the kinds of beliefs that structured social groups such as organizations and states have. According to the 'positional' account defended by Tuomela (1995), the group members authorized for belief or view formation collectively accept the views which will qualify as the beliefs the group has. These views are group-mode views accepted for the group and are, strictly speaking, acceptances of something as the group's views rather than beliefs in the strict sense.
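Returning to the two accounts just distinguished, their contrast can be stated compactly in doxastic-logic notation. The following rendering is illustrative only; the operator symbols are assumptions of this sketch, not the formalism of the works cited. Write $B_i\,p$ for 'agent $i$ believes that $p$' and $E\,p := \bigwedge_{i \in G} B_i\,p$ for 'everyone in group $G$ believes that $p$':

\begin{align*}
\text{Iterative account:} \quad & MB_G\,p \;\Longleftrightarrow\; E\,p \,\wedge\, E\,E\,p \,\wedge\, E\,E\,E\,p \,\wedge\, \dotsb \\
\text{Fixed-point account:} \quad & MB_G\,p \;\Longleftrightarrow\; E\,p \,\wedge\, E\,(MB_G\,p)
\end{align*}

On this rendering the fixed-point clause is impredicative: the notion $MB_G$ being defined occurs in its own definiens, which is exactly the feature returned to in Sect. 4.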
2. Mutual Beliefs More Precisely Characterized

This section will be concerned with what mutual beliefs involve over and above plain shared beliefs, and Sect. 3 takes up the problem of how many layers of hierarchical beliefs are conceptually or psychologically needed. (The discussion below draws on the treatment in Tuomela 1984 and 1995.)

Consider now the iterative account, starting with the case of two persons, A and B. Recall from Sect. 1 that according to the standard iterative account A and B mutually believe that p if and only if A believes that p, B believes that p, A believes that B believes that p (and similarly for B), A believes that B believes that A believes that p (and similarly for B), and so on, in principle ad infinitum. In the general case, the account says that it is mutually believed in a group, say G, that p if and only if (a) everyone in G believes that p, (b) everyone believes that everyone believes that p, (c) everyone believes that everyone believes that everyone
believes that p, and so on ad infinitum. The word 'everyone' can of course be qualified if needed and made dependent on some special characteristics; for example, it can be restricted to concern only every fully fledged, adequately informed, and suitably rational member of the group.

A major problem with the iterative account is that in some cases it does not seem psychologically realistic. It loads people's minds with iterated beliefs which people, after all, do not experientially have and possibly, because of lack of rationality and memory, cannot have. Thus, the account must be improved to make it correspond better to psychological reality. One way to go is to operate partly in terms of lack of disbelief. A person's lack of disbelief that p is defined as its not being the case that the person believes the negation of p. One can try to define mutual belief schematically by saying that it is mutually believed in G that p if and only if everyone believes that p, iterated n times, and from level n+1 on everyone lacks the disbelief that p. With n = 2, viz. two levels actually present, this definition says that mutual belief amounts to everyone's believing that p, everyone's believing that everyone believes that p, and everyone's lacking the disbelief that p above the second level. The basic reason for using this amended iterative approach is that no more levels of positive belief than people actually have should be required. Provided that the value of n can be established, the present account satisfies this. It must be assumed here that the agents in question are to some degree intelligent and rational as well as free from emotional and other disturbances so that they, for instance, do not lack a higher-order belief when they really ought to have one in order to function optimally.

While the iterative account amended with the use of the notion of lack of disbelief seems viable for some purposes, an alternative analysis is worth mentioning. This analysis amends the original iterative account in that it requires iterative beliefs up to some level n and from that level on requires only the disposition to acquire higher-order beliefs in appropriate conditions. These appropriate conditions, serving to determine the value of n, include both background conditions and more specific conditions needed for acquiring the belief in question. First, the agents in G must be assumed to be adequately informed and to share both general cultural and group-specific information; especially, they must have the same standards of reasoning so that they are able to 'work out the same conclusions' (see Lewis 1969, Chap. 2, and Heal 1978). Second, they must be sufficiently intelligent and rational, and they must be free from cognitive and emotional disturbances to a sufficient degree (so that there are no psychological obstacles to adding the (n+1)st-order belief when a genuine mutual belief is involved and to not adding it, or indeed to adding its negation, when a genuine mutual belief is not involved). A typical releasing condition for the
disposition to acquire a higher-order belief would be simply that the agents are asked about higher-order beliefs (or are presented with other analogous problems concerning them). There is not much to choose between the two amended accounts, and the matter will not be discussed further here.

With the conceptual machinery at hand, one can deal with, for instance, the aforementioned phenomena of pluralistic ignorance and false consensus. An account of pluralistic ignorance, concerning something p, must obviously include the idea that everyone believes that p. Second, it must say that the agents do not have the belief that they agree. Compatibly with this, it can still be required that they do not disbelieve that they believe that p. As to false consensus, it must be required that not everyone believes that p and that, nevertheless, it is believed by everyone that everyone believes that p.
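In the same illustrative notation as above (again an assumed rendering, not the article's own formalism), these two conditions can be written down directly, reading $\neg B_i$ as 'agent $i$ lacks the belief that':

\begin{align*}
\text{Pluralistic ignorance:} \quad & E\,p \;\wedge\; \bigwedge_{i \in G} \neg B_i\,(E\,p) \;\wedge\; \bigwedge_{i \in G} \neg B_i\,\neg(E\,p) \\
\text{False consensus:} \quad & \neg E\,p \;\wedge\; E\,(E\,p)
\end{align*}

That is, in pluralistic ignorance everyone believes that p, but no one believes that the belief is shared (while no one positively disbelieves it either); in false consensus the agreement that is believed by everyone to exist does not in fact obtain.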
3. Mutual Beliefs and the Leel Problem Let us next discuss the level problem. Given the iterative account, the problem is how many iterative levels of belief are conceptually, epistemically, or psychologically needed for success in various cases. The criteria here are functionality and success. For instance, how many levels of belief does successful joint action require? The level question is difficult to answer in general terms. However, it can be conjectured that in certain cases at least two levels (viz. n l 2), relative to the base level, are needed for mutual belief, but no more. In some other cases this is not enough. Generally speaking, there are very seldom epistemic or psychological reasons for going beyond the fourth level. For many agents level n l 4 probably is psychologically impossible or at least very difficult to handle in one’s reasoning. The base level is the level zero at which the sentence or proposition p is located, and p will concern or be about an agent’s or agents’ actions, dispositions to act, intentions, beliefs (or something related); or, alternatively, p can be (or concern) a belief content. In the analysis below it is accepted that the beliefs—at least those going beyond the level n l 2—need not be proper occurrent or standing beliefs or even subconscious ones but only dispositions to form the belief in question. (See Audi 1994, for the relevant notion of a disposition to form a belief.) The claim that n l 2 is not only necessary but also sufficient in many typical cases is allowed to involve social loop beliefs where the content of mutual belief could be, for instance, that A believes that B believes that A will act in a certain way, or believes that such and such is the case. What the base level will be is relative to the topic at hand. That at least two levels are needed can be regarded as a conceptual truth (related to the notion ‘social existence’); that in many typical cases no more is needed or that in some other 14041
cases four layers are needed can be regarded as general psychological theses concerning the actual contents of people's minds in those situations.

The present view allows that a need may arise to add layers indefinitely. There follows a recursive argument to show that such a situation is possible in principle. Given an analysis of mutual belief in terms of iteration of beliefs such as ours, at each level under consideration one may start asking questions about people's realizing (or believing) such and such, where 'such and such' refers to the level at hand. This game obviously can go on indefinitely as far as the involved agents' reasoning capacities (or whatever relevant capacities) permit. This is an inductive argument (in the mathematical sense—the first step is obvious) for the open and infinite character of the hierarchy of nested beliefs.

To discuss the level problem in more concrete terms, let us consider mutual belief in the case of two agents, A and B. First we let our base-level sentences be related to the agents' performances of their parts as follows:

(a) A will do his part of X (in symbols, p(A));
(b) B will do his part of X (p(B)).

Next, let us assume:

(i) A believes that (a) and (b);
(ii) B believes that (a) and (b).

This is a social or mutual recognition of the agents' part-performances in the context of joint action. The sentence p in (a) and (b) may alternatively concern other things than action. Thus it may represent a belief content, for example p = 'The Earth is flat,' and we will consider this case below.

Considering joint action, it can be argued that n = 2 is both necessary and sufficient for successful performance. In this example, A will obviously have to believe that B will do his part, which gives A some social motivation to perform his own part. Concentrating on our reference point individual A and genuine joint action X (in which the agents' performances of their parts are interdependent), A must also believe (or be disposed to believe, at any rate, viz. come to form the belief if asked about the matter) that B believes that A will do his part. For otherwise A could not believe with good reason that B will do his part, for B would—if he does not so believe—lack the social motivation springing from his belief that A will do his part. However, if B did lack that motivation, A could not on this kind of ground defensibly form the belief that B will perform his part. This argument shows that it must be required that the agents have a loop belief or at least a disposition to have a loop belief:

(iii) A believes that (i) and (ii);
(iv) B believes that (i) and (ii).

For instance, (iii) gives the loop 'A believes that B believes that p(A),' and this means that in the present kind of case n = 2 is necessary and normally also sufficient (relative to the chosen base level).
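In the same illustrative operator notation used earlier (an assumed rendering), the two required levels for agent A come out as:

\begin{align*}
\text{Level 1, i.e., (i):} \quad & B_A\big(p(A) \wedge p(B)\big) \\
\text{Level 2, i.e., (iii):} \quad & B_A\Big(B_A\big(p(A) \wedge p(B)\big) \,\wedge\, B_B\big(p(A) \wedge p(B)\big)\Big)
\end{align*}

Under standard closure assumptions for the belief operator, level 2 yields the loop instance $B_A\,B_B\,p(A)$ mentioned in the text.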
An analogous claim can be defended in the case of belief. Here the defense can be given in terms of the agents' recognition that they share a belief. Suppose that each member of the Flat Earth Society not only believes that the Earth is flat, but—because of being organized into a belief group—the members all share this information. This second-order belief indicates the first level at which social and mutual belief can be spoken of in an interesting sense, for at this level people recognize that the others in the group believe similarly, and this establishes a doxastic social connection; at level one that recognition is, however, missing. This is a typical case of a social property, and again n = 2 is both necessary and normally sufficient for acting in matters related to the shape of the Earth qua a member of the society.

Suppose next that the members of the Flat Earth Society begin to reflect upon this second-order belief of theirs and come to form the belief that indeed it does exist. (Perhaps a sociologist has just discovered that they share the second-order belief and told them about it.) They might wonder how strong this second-order awareness is, and so on. All this entails that on some occasions third-order beliefs may be needed, although the standard case clearly would be second-order beliefs accompanied by a disposition to go higher up, if the need arises. Analogously, in some very special cases of joint action, double loop beliefs, for example, may be needed for rational action.

The reader should be reminded that not all social notions depend on mutual belief. For instance, latent social influence and power do not require it, nor does a unilateral love relation between two persons.
4. Mutual Belief as Shared We-belief

What have been called 'we-attitudes' are central to an account of social life (see Tuomela 1995, Chap. 1, Tuomela and Balzer 1999). A we-attitude is a person's attitude (say belief, want, fear, etc.) concerning something p (a proposition, or sentence) such that (1) this person has the attitude in question, and he believes that (2) everyone in the group has the attitude and also that (3) there is a mutual belief in the group to the effect that (2). Of these, clause (1) is required because there cannot of course be a truly shared attitude without all the members participating in that attitude. Clause (2) gives a social reason for adopting and having the attitude, and (3) strengthens the reason by making it intersubjective. (A shared we-attitude can be either in the group mode or in the individual mode; cf. Sect. 1.)

A shared we-belief is a we-belief (be it in the individual or in the group mode) which (ideally) all the group members have. To take an example, a shared we-belief that the Earth is flat entails that the group members believe that the Earth is flat, that they believe that the group members believe that the Earth is flat, and also that it
is a mutual belief in the group that the group members believe that the Earth is flat.

Shared we-beliefs can be related to the reflexive or fixed-point account of mutual belief. According to the simplest fixed-point account, mutual belief is defined as follows: it is a mutual belief in a group that p if and only if everyone in the group believes that p and also that it is mutually believed in the group that p. It can thus be seen that the account of mutual belief given by the fixed-point theory is equivalent to the definiens in the definition of a shared we-belief. In the fixed-point approach the syntactical infinity involved in the iterative approach is cut short by a finite fixed-point formula, that is, an impredicative construct in which the joint notion to be 'defined' already occurs in the definiens. Under certain rationality assumptions about the notion of belief it can be proved that the iterative approach, which continues iterations ad infinitum, gives the fixed-point property as a theorem (see Halpern and Moses 1992 and, for a more general account, Balzer and Tuomela 1997). The fixed-point account is in some contexts psychologically more realistic, as people are not required to keep iterative hierarchies in their minds. Note, however, that it depends on context whether the iterative approach or the fixed-point approach is more appropriate. Thus, in the case of successful joint action, at least loop beliefs must be required.

See also: Action, Collective; Collective Behavior, Sociology of; Collective Beliefs: Sociological Explanation; Collective Identity and Expressive Forms; Collective Memory, Anthropology of; Collective Memory, Psychology of; Genes and Culture, Coevolution of; Human Cognition, Evolution of; Memes and Cultural Viruses; Religion: Culture Contact; Religion: Evolution and Development; Religion, Sociology of
Bibliography

Audi R 1994 Dispositional beliefs and dispositions to believe. Nous 28: 419–34
Balzer W, Tuomela R 1997 A fixed point approach to collective attitudes. In: Holmström-Hintikka G, Tuomela R (eds.) Contemporary Action Theory II. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 115–42
Grice P 1989 Studies in the Way of Words. Harvard University Press, Cambridge, MA
Halpern J, Moses Y 1992 A guide to completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence 54: 319–79
Heal J 1978 Common knowledge. Philosophical Quarterly 28: 116–31
Lagerspetz E 1995 The Opposite Mirrors: An Essay on the Conventionalist Theory of Institutions. Kluwer Academic Publishers, Dordrecht, The Netherlands
Lewis D 1969 Convention: A Philosophical Study. Harvard University Press, Cambridge, MA
Ruben D-H 1985 The Metaphysics of the Social World. Routledge, London
Scheff T J 1967 Toward a sociological model of consensus. American Sociological Review 32: 32–46
Schelling T C 1960 The Strategy of Conflict. Harvard University Press, Cambridge, MA
Schiffer S 1972 Meaning. Oxford University Press, Oxford, UK
Tuomela R 1984 A Theory of Social Action. Reidel, Dordrecht, The Netherlands
Tuomela R 1995 The Importance of Us: A Philosophical Study of Basic Social Notions. Stanford Series in Philosophy, Stanford University Press, Stanford, CA
Tuomela R, Balzer W 1999 Collective acceptance and collective social notions. Synthese 117: 175–205
R. Tuomela
Sherrington, Sir Charles Scott (1857–1952)

Like this old Earth that lolls through sun and shade,
Our part is less to make than to be made.
(Sherrington 1940, p. 259)
To characterize the broad span of Sherrington's engagement, he should be described as 'Pathologist, Physiologist, Philosopher and Poet.' It is somewhat arbitrary to pick out only two of them, because it is the combination of all these elements that made him both great and unique. However, hardly any other physiologist since 1800 made such a fundamentally important contribution to the knowledge and to the development of concepts in modern neurophysiology. Moreover, when advanced in years, he was the first to attempt a contribution to the solution of the age-old problem of brain/body and mind on a well-founded neuroscientific basis.
1. The Life
Born in London on November 27, 1857, Sherrington was brought up in Ipswich and Norwich. From his earliest years he felt a strong attraction to poetry, a love which never left him throughout his life. Some of his poems were published in various magazines and later were partly collected in a small volume (Sherrington 1925). In 1876 Sherrington first enrolled at St. Thomas' Hospital Medical School in London and in 1881 matriculated at Cambridge University, where he received his B.A. in 1884 and his M.D. in 1885. During his undergraduate years he started his research in neuroanatomy and neuropathology, publishing his first papers on these studies in 1884 together with John Newport Langley (Langley and Sherrington 1884). This was the beginning of a very productive publishing activity with almost 320 scientific publications (for Sherrington's complete bibliography see Fulton 1952). During his initial postgraduate period at Cambridge
Sherrington's scientific interests concentrated on neurohistology and pathology. This period was interrupted by expeditions to epidemics of Asian cholera in Spain and Italy. The autopsied material taken during these expeditions he analyzed together with J. Graham Brown and Charles S. Roy in England (1885) and with Rudolf Virchow and Robert Koch in Germany (1886). In 1890 Sherrington was appointed lecturer in physiology at St. Thomas' Hospital in London, where he mainly performed clinical work. A year later he received the position of Professor-Superintendent at the Brown Institute at the University of London, where he became remarkably productive, with a variety of publications which reflected his pathological, but also his increasing neurophysiological, interest. In 1895 Sherrington received his first appointment to a full professorship of physiology at University College, Liverpool, where he intensely investigated the organization of the spinal cord. At age 56 (in 1913) Sherrington entered probably the busiest period of his scientific career when he was appointed to the chair of physiology at Oxford University, from which he did not retire until 1935, when he was 78. It was during his Oxford time that he, together with Edgar Adrian, received the Nobel Prize for Medicine in 1932, the greatest tribute among numerous honours and awards, such as the Presidency of the Royal Society (1920–25), some 21 international honorary doctorates, and many medals and honorary memberships of British and international societies and academies, among these the corresponding membership of l'Institut de France. Three of his students at Oxford were later Nobel Prize winners: Howard Walter Florey, John Carew Eccles, and Ragnar Granit.

In the early 1930s Sherrington performed his last physiological experiments and felt free to turn to his other great love, the philosophy of the nervous system. After he retired to his old hometown of Ipswich in 1935, his scientific interests were increasingly enriched by philosophical considerations that became evident in the Gifford Lectures, held between 1937 and 1938. Obviously, it was also Sherrington's interest in Natural Philosophy that caused him to work on Goethe (Sherrington 1942) and on Jean Fernel, a French physician and philosopher of the sixteenth century (Sherrington 1946). Though his body was crippled by arthritis, his intellectual capacities remained undimmed until his death in Eastbourne on March 4, 1952.
2. The Physiologist

Regarding the nervous system as a functional unity resulting from its well-organised parts, Sherrington investigated a variety of motor and sensory systems: the topographical organisation of the motor cortex, the binocular integration in vision, as well as the
different receptor systems. However, his main efforts concentrated on the spinal cord and its reflex function in motor control. He explicitly admitted that spinal reflex behaviour is not 'the most important and far-reaching of all types of nerve behavior' but has the advantage that 'it can be studied free from complication with the psyche' (Sherrington 1906/1947, Foreword, p. xiii). In the spinal cord, isolated from the brain, he carefully analyzed almost all spinal reflex types: from the 'simple' two-neurone reflex arc of the stretch reflex (first described by him), via the complex, partly nociceptive flexion and crossed extension reflexes and the co-ordinated rhythmic scratching and stepping reflexes, up to goal-directed reflex movements of the limbs towards irritating stimuli. Right from the beginning he demonstrated that spinal reflexes do not follow a stereotyped all-or-nothing pattern, but that the adaptability of reflexes, based on graduated amplitude, irradiation, and mutual interference between different reflex types, is of great importance for motor control. Sherrington was the first to suggest that inhibition had at least the same functional importance for motor coordination—and other nervous functions—as excitation, not only at the spinal level but also at the brain level and for descending motor commands. This view also opened up new insights into the background of motor disorders observed in various neurological syndromes.

Though some of the reflexes and other motor and sensory phenomena described by Sherrington had been noted earlier, his experimental observations and their analyses were altogether more exact and careful than those of former investigations. Thus, he disproved, for example, several of Pflüger's reflex laws and the assumption of Descartes (1677) and John and Charles Bell (1826) that reciprocal inhibition is localized peripherally in the muscle. However, experimental sophistication alone would surely not have earned such particularly high and enduring recognition. Indeed, Sherrington's most important contribution to neuroscience resulted from his unique capability to condense his immense knowledge of sensory and motor functions and of the structure of the nervous system (partly derived from an intense contact with the famous Spanish neurohistologist Santiago Ramón y Cajal) into an almost visionary synopsis of the integrative action of the nervous system, which laid the foundation for the great progress of neuroscience in the twentieth century and which has lost none of its basic validity. This synopsis made his book The Integrative Action of the Nervous System (1906) a milestone in neurophysiology, which has been compared in importance with Harvey's De motu cordis, 300 years its senior, 'in marking a turning point in the history of physiological thought' (Fulton 1952, p. 174). For Sherrington, spinal reflexes did not have a 'purpose' but a 'meaning.' He replaced the formerly presumed 'soul' of the spinal cord (e.g., Legallois 1812, Pflüger
1853) as the driving force for reflexes, and even for voluntary movements, by an astonishingly exact conception of the function of spinal interneuronal activity. He replaced the Cartesian reflex-behaviorist view by the still-relevant view that the performance of coordinated movements as a basis for animal and 'conscious' human behavior requires an integration of 'intrinsically' generated brain functions with 'extrinsically' induced reflex functions. In this context he suggested that pure 'apsychical' reflex behavior without a contribution of 'mind' loses importance with phylogenetic ascent, playing the smallest role in man: 'The spinal man is more crippled than is the spinal frog' (Sherrington 1906/1947, Foreword, p. xiv).

As a true scientist Sherrington never felt that he had arrived at any incontrovertible truth, and in a typically unpretentious understatement he concluded: 'I have dealt very imperfectly with a small bit of a large problem…. The smallness of the fragment is disappointing' (1931, p. 27); this from a man about whom Frederick George Donnan wrote in 1921 to the Nobel Committee: 'In the field of physiology and medicine, Professor Sherrington's work reminds one of that of Kepler, Copernicus and Newton in the field of mechanics and gravitation' (in Schück et al. 1962, p. 310).
3. The Philosopher

Like, for example, Descartes before him and Eccles after him, Sherrington realised that mere experimental investigation of the nervous system was not sufficient to solve the fundamental problem of the body/brain–mind relation, particularly the difficulty of explaining how the liaison actually works. In the Rede Lecture on The Brain and its Mechanism (Sherrington 1933), in which he considered the complexity of brain functions in motor control and behavior and in which he tentatively approached this problem, Sherrington took a fully dualistic view. He even denied any scientific right to 'conjoin mental experience with the physiological…. The two … seem to remain disparate and disconnected' (1933, pp. 22–3). But later, in the Gifford Lectures, which he held in Edinburgh in 1937–1938 (published as Man on his Nature, Sherrington 1940, and as an extensively revised new edition in 1951), he took up the great enigma of the brain–mind relation in the light of his new neuroscientific knowledge. Rejecting the assumption of a mysterious mind of 'heavenly' origin, he stated, 'Ours is an earthly mind which fits our earthly body' (Sherrington 1940, p. 164). 'Mind' for Sherrington was coupled to motor acts, 'mind' only being recognizable via motor acts. He suggested that 'mind' is graduable, that a recognizable 'mind' develops in the developing brain in humans and to varying degrees in animals. Regarding the relation between the two concepts of 'energy' (assuming 'matter' also as 'energy' in the physical sense) and
'mind,' he concluded that 'mind' as a function of the living brain is mortal, while the 'energy'—the matter—is immortal, just changing its state when the brain dies. In his discussions he entirely avoided the complicating problem of an 'immortal soul,' which he, himself oriented towards Natural Religion, left to the Revealed Religions: 'When on the other hand the mind-concept is so applied as to insert into the human individual an immortal soul, again a trespass is committed' (Sherrington 1940, p. 355).

When Sherrington, with something like resignation, stated, 'But the problem of how that [body–mind] liaison is effected remains unsolved; it remains where Aristotle left it more than 2000 years ago' (Sherrington 1906/1947, Foreword, p. xxiii), he probably underestimated his own step towards a solution: the definition of 'mind' as a function of the brain requiring a complex integrative action of the nervous system, a view which was taken up and further developed by one of his students, Sir John C. Eccles (1970, Eccles and Gibson 1979).
4. And Now?

As stated in Sect. 3, the experimentally well-based theories and concepts which Sherrington had creatively developed with great visionary imagination formed a well-recognised, secure foundation for future developments in the neurosciences. The half-century after Sherrington's death has brought an immense increase in neuroscientific knowledge. First, the technique of microrecordings from nerve cells and the analysis of ionic membrane mechanisms, in particular, confirmed and extended the knowledge and hypotheses based on Sherrington's work on brain and spinal functions in sensory perception, motor control, and behavior. Then new techniques allowed for the analytic investigation of more and more detailed neuronal structures, functions, and interactions, down to the molecular level. New genetic techniques enabled the 'production' of innumerable variants of mice with defined genetic defects at the molecular level. However, the bewildering mass of results originating from these experiments opened up a host of new questions. What is required now is a new 'Sherringtonian' approach which comprises a complex critical analysis of today's entire neuroscientific knowledge (from molecular mechanisms up to the level of complex behaviour) in an integrative synopsis and which develops new concepts to explain the sophisticated brain functions and their relation to mind as expressed by behavior. Despite the exponential increase in detailed neuroscientific knowledge, we have not really come closer to a solution of that problem since Sherrington.

See also: Behavioral Neuroscience; Cognitive Neuroscience; Consciousness, Neural Basis of; Cross-modal
(Multi-sensory) Integration; Functional Brain Imaging; Motor Control; Motor Cortex; Motor Skills, Psychology of; Neural Plasticity of Spinal Reflexes; Neural Representations of Direction (Head Direction Cells); Psychophysiology; Theory of Mind; Topographic Maps in the Brain; Vestibulo-ocular Reflex, Adaptation of
Bibliography

Bell J, Bell C 1826 Anatomy and Physiology of the Human Body. Edinburgh
Descartes R 1677 De homine, ed. de la Forge. Amsterdam
Eccles J C 1970 Facing Reality. Springer-Verlag, New York
Eccles J C, Gibson W C 1979 Sherrington—His Life and Thought. Springer International, Berlin
Fulton J F 1952 Sir Charles Scott Sherrington, O.M. Journal of Neurophysiology 15: 167–90
Granit R 1966 Charles Scott Sherrington—An Appraisal. Nelson, London
Langley J N, Sherrington C S 1884 Secondary degeneration of nerve tracts following removal of the cortex of the cerebrum in the dog. Journal of Physiology 5: 49–65
Legallois M 1812 Expériences sur le principe de la vie. D'Hautes, Paris
Pflüger E 1853 Die sensorischen Funktionen des Rückenmarks der Wirbeltiere. Springer, Berlin
Schück H, Sohlman R, Österling A, Liljestrand G, Westgren A, Siegbahn M, Schou A, Ståhle N K 1962 Nobel, the Man and His Prizes. The Nobel Foundation, Elsevier, Amsterdam
Sherrington C S 1906/1947 The Integrative Action of the Nervous System, 2nd edn. Yale University Press, New Haven, CT, with a new Foreword, 1947
Sherrington C S 1925 The Assaying of Brabantius and Other Verse. Oxford University Press, London
Sherrington C S 1931 Quantitative management of contraction in lowest level co-ordination. Hughlings Jackson Lecture. Brain 54: 1–28
Sherrington C S 1933 The Brain and its Mechanism. Cambridge University Press, Cambridge, UK
Sherrington C S 1940/1951 Man on his Nature, 2nd rev. edn. 1951. Cambridge University Press, Cambridge, UK
Sherrington C S 1942 Goethe on Nature and on Science. Cambridge University Press, Cambridge, UK
Sherrington C S 1946 The Endeavour of Jean Fernel. Cambridge University Press, Cambridge, UK
E. D. Schomburg
Short-term Memory, Cognitive Psychology of

The concept of short-term memory has been of theoretical significance to cognitive psychology since the late 1950s. Some investigators have even argued that 'all the work of memory is in the short-term system' (Shiffrin 1999, p. 21). However, others have claimed that short-term memory is an archaic concept and there is no need to distinguish the processes involved in short-term memory from other memory processes (Crowder 1993).

1. Definitions and Terminology

'Short-term memory' refers to memory over a short time interval, usually 30 s or less. Another term for the same concept is 'immediate memory.' Both these terms have been distinguished from the related terms 'short-term store' and 'primary memory,' each of which refers to a hypothetical temporary memory system. However, the term 'short-term memory' has also been used by many authors to refer to the temporary memory system. Thus, here the term 'short-term memory' is used in both senses. It is important to distinguish 'short-term memory' from the related concepts 'working memory' (see Working Memory, Psychology of) and 'sensory memory.' Some authors have used the terms 'short-term memory' and 'working memory' as synonymous, and indeed the term 'working memory' has been gradually replacing the term 'short-term memory' in the literature (and some authors now refer to 'short-term working memory'; see Estes 1999). However, 'working memory' was originally adopted to convey the idea that active processing as well as passive storage is involved in temporary memory. 'Sensory memory' refers to memory that is even shorter in duration than short-term memory. Further, sensory memory reflects the original sensation or perception of a stimulus and is specific to the modality in which the stimulus was presented, whereas information in short-term memory has been coded so that it is in a format different from that originally perceived.

2. Historical Development and Empirical Observations

An initial description of short-term memory was given by James (1890), who used the term 'primary memory' and described it as that which is held momentarily in consciousness. The intense study of short-term memory began almost 70 years later with the development of the distractor paradigm (Brown 1958, Peterson and Peterson 1959).

2.1 The Distractor Paradigm

In this paradigm, a short list of items (usually five or fewer) is presented to subjects for study. If the subjects recall the list immediately, perfect performance results because the list falls within their memory span. However, before the subjects recall the items, they engage in an interpolated activity, the distractor task,
which usually involves counting or responding to irrelevant material. The purpose of the distractor task is to prevent the subjects from rehearsing the list items. The length of the distractor task varies, and its duration is called the 'retention interval.' By comparing performance after various retention intervals, it was found that the rate of forgetting information from short-term memory is very rapid, so that after less than 30 s little information remains about the list of items.

Another central finding from the distractor paradigm involves the serial position curve, which reveals accuracy for each item in the list as a function of its position. The curve is typically bowed and symmetrical, with higher accuracy for the initial and final positions than for the middle positions. The distractor paradigm typically requires serial recall (i.e., recall in the order in which the items were presented; see Representation of Serial Order, Cognitive Psychology of). Two types of errors occur with this procedure. Transposition errors are order errors in which subjects recall an item in the wrong position. Nontransposition errors are substitution errors in which subjects replace an item with one not included in the list. For example, if subjects receive the list BKFH and recall VKBH, they make a nontransposition error in the first position and a transposition error in the third. The bowed serial position curve reflects the pattern of transposition errors. The pattern of nontransposition errors shows instead a relatively flat function, with errors increasing slightly across the list positions.

Errors in the distractor paradigm can also be classified by examining the relation between the correct item and the item that replaces it. In the example, the subjects replace B with V. Because those two letters have similar-sounding names, the error is called an 'acoustic confusion error,' and such errors occur often in the distractor paradigm. Even visually presented items are typically coded in short-term memory in an acoustic or speech representation. This fact was first illustrated in an experiment in which subjects were shown a list of six consonants for immediate serial recall. The errors were classified by a confusion matrix, in which the columns indicate the letter presented and the rows indicate the letter actually recalled. A high correlation was found between the confusion matrix resulting from this memory task, when subjects recalled a list of visually presented letters, and the confusion matrix resulting from a listening task, when subjects heard one letter at a time and simply named it with no memory requirement (Conrad 1964).
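The logic of this comparison can be made concrete with a small worked sketch. The Python fragment below builds confusion matrices from (presented, recalled) letter pairs and correlates the off-diagonal (error) cells of two such matrices; the letter pairs are invented for illustration and are not Conrad's (1964) data, and the correlation routine is an ordinary Pearson coefficient.

```python
from collections import Counter

# Hypothetical (presented, recalled) letter pairs from a memory task
# and from a listening task -- illustrative data only, not Conrad's.
memory_pairs = [('B', 'V'), ('V', 'B'), ('B', 'B'), ('P', 'B'),
                ('S', 'F'), ('F', 'S'), ('M', 'N'), ('N', 'M')]
listening_pairs = [('B', 'V'), ('V', 'B'), ('P', 'V'), ('B', 'P'),
                   ('F', 'S'), ('S', 'F'), ('N', 'M'), ('M', 'N')]

letters = sorted({letter for pairs in (memory_pairs, listening_pairs)
                  for pair in pairs for letter in pair})

def confusion_matrix(pairs):
    # rows = letter actually recalled/reported, columns = letter presented
    counts = Counter(pairs)
    return [[counts[(presented, recalled)] for presented in letters]
            for recalled in letters]

def off_diagonal(matrix):
    # error cells only: presented letter != reported letter
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(off_diagonal(confusion_matrix(memory_pairs)),
            off_diagonal(confusion_matrix(listening_pairs)))
print(round(r, 2))  # a high r suggests acoustic coding of memory errors
```

With acoustically driven errors in both tasks, the shared off-diagonal structure yields a high positive r, which is the pattern Conrad reported.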
2.2 The Free Recall Task

Another paradigm commonly used in the early investigation of short-term memory was the free recall task, in which subjects are given a relatively long list of items and then recall them in any order they choose. When the list is recalled immediately, the subjects
show greater memory for the most recent items. This 'recency effect' was attributed to the fact that the final items in the list, but not the earlier ones, are still in short-term memory at the termination of the list presentation. Support for this conclusion came from the observation that if presentation of the list is followed by a distractor task, then there is no recency effect, although as in the case of immediate recall there is a 'primacy effect,' or advantage for the initial items in the list. The explanation offered for the elimination of the recency effect with the distractor task is that the final list items are no longer in short-term memory after the distractor activity. Further support for this explanation came from the finding that other variables, like list length and presentation rate, had differential effects on the recency and earlier sections of the serial position curve. Specifically, subjects are less likely to recall an item when it occurs in a longer list for all except the most recent list positions. Likewise, subjects are less likely to recall an item in a list presented at a faster rate for all but the recency part of the serial position curve.
3. Theoretical Accounts

The most widely accepted account of short-term memory was presented in the 1960s in what was subsequently termed the 'modal model' because of its popularity. The core assumption of that model is the distinction between short-term memory, which is transient, and long-term memory, which is relatively permanent. The fullest description of the modal model was provided by Atkinson and Shiffrin (1968), who also distinguished between short-term memory and sensory memory.
3.1 The Buffer Model

Atkinson and Shiffrin's 'buffer model,' as it has been called, is characterized along two dimensions (Atkinson and Shiffrin 1968). Following a computer analogy, the first dimension involves the structural features of the system, analogous to computer hardware, and the second dimension involves the 'control processes,' or operations under the control of the subjects, analogous to computer software. The structural features of the buffer model include the sensory registers (with different registers for each sense), short-term store, and long-term store. The control process emphasized is rote rehearsal, which takes place in a part of short-term store called the 'buffer.' The rehearsal buffer has a small capacity with a fixed number of slots (about four).

To account for performance in the free recall task, it is assumed that each item enters the buffer and, when the buffer is full, the newest item displaces a randomly selected older item. While an item is in the buffer,
information about it is transferred to long-term store, with the amount of information transferred a linear function of the time spent in the buffer. Although information about an item thereby gets transferred to long-term store, the item remains in the buffer until it is displaced by an incoming item. At test, subjects first respond with any items in the buffer and then try to retrieve other items from long-term store, with the number of retrieval attempts fixed (see Memory Retrieval).

These assumptions allow the model to account for the various empirical observations found with the free recall task. The model accounts for the recency effect and its elimination with a distractor task because the final items are still in the buffer immediately after list presentation but get displaced from the buffer by the interpolated material of the distractor task. The model accounts for the primacy effect by assuming that the buffer starts out empty, so the initial items reside in the buffer longer than subsequent items because they are not subject to displacement until the buffer is full. The effect of list length is due to the fixed number of attempts to retrieve information from long-term store: the longer the list, the smaller is the likelihood of finding a particular item. The effect of presentation rate is due to the linear function for transferring information from short-term to long-term store: more information is transferred when the rate is slower, so that retrieving an item from long-term store is more likely at a slower rate.

There have been numerous refinements and expansions of the buffer model since it was first proposed. These updated versions have been termed 'SAM' (search of associative memory) and 'REM' (retrieving effectively from memory). However, these refinements have focused on the search processes in long-term memory, and the short-term memory component of the model remains largely intact (Shiffrin 1999).
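Because the model's assumptions are mechanistic, they can be expressed directly as a simulation. The sketch below is a deliberately simplified illustration, not Atkinson and Shiffrin's own formalization; the buffer size of four follows the text, but the transfer rate, number of retrieval attempts, and trial counts are arbitrary demonstration values.

```python
import random

def simulate_free_recall(list_length, presentation_time,
                         buffer_size=4, transfer_rate=0.04,
                         retrieval_attempts=30):
    """One simulated free-recall trial under a simplified buffer model."""
    buffer, lts_strength = [], {}
    for item in range(list_length):
        lts_strength[item] = 0.0
        if len(buffer) == buffer_size:
            # the newest item displaces a randomly selected older item
            buffer.pop(random.randrange(buffer_size))
        buffer.append(item)
        # information accrues in long-term store (LTS) linearly with
        # the time each occupant spends in the buffer
        for occupant in buffer:
            lts_strength[occupant] += transfer_rate * presentation_time

    # at test: report the buffer contents first, then make a fixed
    # number of probabilistic retrieval attempts from LTS
    recalled = set(buffer)
    for _ in range(retrieval_attempts):
        item = random.randrange(list_length)
        if random.random() < lts_strength[item]:
            recalled.add(item)
    return recalled

# Averaging many trials yields a serial-position curve: the last
# buffer_size positions show recency (still in the buffer), early
# positions show primacy (longer buffer residence, more LTS strength),
# and longer lists depress mid-list recall.
trials = [simulate_free_recall(20, 1.0) for _ in range(2000)]
curve = [sum(pos in recalled for recalled in trials) / len(trials)
         for pos in range(20)]
print([round(p, 2) for p in curve])
```

A distractor task can be mimicked by presenting further, unscored items that flush the buffer before the test, which removes the recency portion of the curve while leaving the long-term-store portion intact, as the text describes.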
3.2 The Perturbation Model

Whereas the buffer model explains the results of the free recall task, a popular model proposed by Estes (1972) explains the results of the distractor paradigm. According to this 'perturbation model,' the representation in memory of each item in a list is associated with a representation of the experimental context. This contextual representation is known as a 'control element.' It is assumed that if the control element is activated, then the representations of the items associated with it are activated in turn, allowing for their recall. Forgetting is attributed in part to the fact that the context shifts with time, so that subjects may be unable to activate the appropriate control element after a delay. Reactivation of the item representations by the control element does not occur only at the time of test. In addition, there is a reverberatory loop providing a periodic recurrent reactivation of each item's representation, with the difference in reactivation times for the various items reflecting the difference in their initial presentation times. This reverberatory activity provides the basis for the short-term memory of the order of the items in a list. Because of random error in the reactivation process, timing perturbations result, and these perturbations may be large enough to span the interval separating item representations, thereby leading to interchanges in the order of item reactivations and, hence, transposition errors during recall. Because such interchanges can occur in either the forward or backward direction for middle items in the list but in only one direction for the first and last items, the bowed serial position function for transposition errors is predicted. The symmetry in the serial position function is predicted with the assumption that timing perturbations do not start until all the list items have been presented. Because interchanges can also occur between the list items and the interpolated distractor items, the gradually increasing proportion of nontransposition errors is also predicted.

The original version of the perturbation model included only the perturbation process responsible for short-term memory, but subsequent research documented the need to include a long-term memory process in addition to the short-term perturbation process (Healy and Cunningham 1999). Other extensions allowed the perturbation model to account for a wide range of findings in the distractor paradigm and also to provide insights into other memory processes such as those responsible for memory distortions in eyewitness situations (Estes 1999; see also Eyewitness Memory: Psychological Aspects).
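The perturbation process, too, can be sketched as a simulation. Here each reverberatory cycle is reduced to a probabilistic swap of adjacent positions, a simplification of the timing account given above; the swap probability and the number of cycles are arbitrary illustrative values rather than Estes's estimates.

```python
import random

def perturb_order(n_items, n_cycles=20, swap_prob=0.05):
    """Perturb a remembered item order over reverberatory cycles.

    On each cycle, random timing error may be large enough to span
    the interval between neighbouring item representations, modelled
    here as a chance swap of adjacent positions.
    """
    order = list(range(n_items))
    for _ in range(n_cycles):
        for i in range(n_items - 1):
            if random.random() < swap_prob:
                order[i], order[i + 1] = order[i + 1], order[i]
    return order

# Transposition-error rate by input position: end items can drift in
# only one direction, middle items in two, so the curve is bowed and
# roughly symmetrical, as in distractor-paradigm data.
n_items, n_trials = 5, 10000
errors = [0] * n_items
for _ in range(n_trials):
    final_order = perturb_order(n_items)
    for position, item in enumerate(final_order):
        if item != position:
            errors[item] += 1
print([round(count / n_trials, 3) for count in errors])
```

More cycles, corresponding to a longer retention interval, raise the overall error level, while interchanges with distractor items would contribute the slowly rising nontransposition component.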
3.3 Theoretical Controversies and Complications

Although models of memory typically include the distinction between short- and long-term memory, some investigators have pointed to problems with some of the evidence establishing the need to postulate a distinct short-term memory. For example, Craik and Lockhart (1972) argued that short-term memory cannot be distinguished by the exclusive use of speech coding, as suggested by Conrad (1964). Craik and Lockhart proposed an alternative framework called 'levels of processing,' according to which information is encoded at different levels and the level of processing determines the subsequent rate of forgetting (see Memory: Levels of Processing). More recent arguments against the need for a separate short-term memory were made by Crowder (1993), who pointed out that the rapid forgetting across retention intervals in the distractor paradigm is not found on the first trial of an experiment. He also pointed out that a recency effect like that in immediate free recall is found in a number of tasks relying exclusively on long-term memory. Healy and McNamara (1996) dismissed some of these arguments with two general considerations. First, information can be rapidly encoded in long-term memory, so that, for example, recall can derive from long-term memory rather than short-term memory on the first trial of an experiment, before interference from previous trials has degraded the long-term memory representation. Second, although recency effects occur in many memory paradigms, the specific properties and causes of the serial position functions differ across paradigms.

Memory models also typically make the distinction between sensory and short-term memory. However, an important exception is Nairne's 'feature model' (Nairne 1990). Instead of distinguishing between these two types of memory stores or processes, the feature model distinguishes between two types of memory trace features—modality independent (which involve speech coding) and modality dependent (which are perceptual but not sensory). According to this model, during recall subjects compare the features of the memory trace to the features of various item candidates. Forgetting occurs as the result of feature overwriting, by which a new feature can overwrite an old feature, but only if the two features are of the same type (modality dependent or modality independent).

Although short-term memory is distinguished from both sensory and long-term memory by most contemporary models of memory, including a recent 'composite model' proposed by Estes (1999) to describe the current state of theorizing, many current models have broken down short-term memory into different subcomponent processes. The most popular of these models is Baddeley's 'working memory model' (Baddeley 1992), which includes an attentional control system called the 'central executive' (see Attention: Models) and two 'slave systems,' the phonological loop (responsible for speech coding) and the visuospatial sketchpad (responsible for the coding of both spatial and visual information). Recent neuropsychological evidence (Martin and Romani 1994) has led to a further breakdown into semantic and syntactic components in addition to the components of Baddeley's system.

Despite these controversies and complications, it seems clear that the concept of 'short-term memory' will continue to play an important role in our theoretical understanding of cognitive processes.

See also: Learning and Memory: Computational Models; Learning and Memory, Neural Basis of; Long-term Potentiation (Hippocampus); Memory Models: Quantitative; Short-term Memory: Psychological and Neural Aspects; Visual Memory, Psychology of; Working Memory, Neural Basis of; Working Memory, Psychology of
Bibliography

Atkinson R C, Shiffrin R M 1968 Human memory: A proposed system and its control processes. In: Spence K W, Spence J T (eds.) The Psychology of Learning and Motivation: Advances in Research and Theory 2. Academic Press, San Diego, pp. 89–195
Baddeley A D 1992 Is working memory working? The fifteenth Bartlett lecture. Quarterly Journal of Experimental Psychology 44: 1–31
Brown J 1958 Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology 10: 12–21
Conrad R 1964 Acoustic confusions in immediate memory. British Journal of Psychology 55: 75–84
Craik F I M, Lockhart R S 1972 Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior 11: 671–84
Crowder R G 1993 Short-term memory: Where do we stand? Memory and Cognition 21: 142–45
Estes W K 1972 An associative basis for coding and organization in memory. In: Melton A W, Martin E (eds.) Coding Processes in Human Memory. Winston, Washington, DC, pp. 161–90
Estes W K 1999 Models of human memory: A 30-year retrospective. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 59–87
Healy A F, Cunningham T C 1999 Recall of order information: Evidence requiring a dual-storage memory model. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 151–64
Healy A F, McNamara D S 1996 Verbal learning and memory: Does the modal model still work? Annual Review of Psychology 47: 143–72
James W 1890 The Principles of Psychology. Holt, New York
Martin R C, Romani C 1994 Verbal working memory and sentence comprehension: A multiple-components view. Neuropsychology 8: 506–23
Nairne J S 1990 A feature model of immediate memory. Memory and Cognition 18: 251–69
Peterson L R, Peterson M J 1959 Short-term retention of individual verbal items. Journal of Experimental Psychology 58: 193–8
Shiffrin R M 1999 30 years of memory. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 17–33
A. F. Healy
Short-term Memory: Psychological and Neural Aspects

1. Short-term Memory

'Short-term memory' refers to a number of systems with limited capacity (in the verbal domain, roughly the 'magical' number 7±2 items: Miller 1956) concerned with the temporary retention (in the range of seconds) of a variety of materials. Knowledge of the functional and anatomical organization of short-term memory in humans, and of its role in cognition at the turn of the twenty-first century, is presented here, drawing data from three main sources of evidence: (a)
behavioral studies in normal individuals and in brain-injured patients with selective neuropsychological short-term memory deficits; (b) correlations between the anatomical localization of the cerebral lesion and the short-term memory disorder of brain-damaged patients; (c) correlations between the activation of specific cerebral areas and the execution of tasks assessing short-term retention in normal subjects. The two more extensively investigated aspects of short-term memory are considered: verbal and visual/spatial.

'Short-term memory' is closely related to the concept of 'working memory' (see Working Memory, Psychology of; Working Memory, Neural Basis of). The present article focuses on the 'storage' and 'rehearsal' components of the system, rather than on the cognitive operations and executive functions currently associated with 'working memory.' However, the section devoted to the uses of short-term memory illustrates some of the working aspects of short-term retention.
2. Historical Origin of the Construct 'Short-term Memory'

Suggestions of a distinction between two types of memory, one concerned with temporary retention, and the other having the function of a storehouse for materials which have been laid out of sight, date back at least to John Locke's 'Essay Concerning Human Understanding' (1700). William James (1890) revived the distinction, suggesting the existence of a limited-capacity 'primary memory,' embracing the present and the immediate past, and subserving consciousness (see Consciousness, Neural Basis of). Psychological research in the nineteenth century and in the first half of the twentieth century was, however, mainly concerned with the diverse factors affecting learning and retention, in the context of a basically unitary view of human memory. It was only in the 1950s that short-term memory became the object of systematic behavioral studies in normal subjects (Baddeley 1976). In the late 1960s the division of human memory into a short- and a long-term system became a current view (Atkinson and Shiffrin 1968).
3. Functional Architecture of Short-term Memory

3.1 Evidence from Normal Subjects

Three main behavioral phenomena suggest the existence of a discrete limited-capacity system, concerned with short-term retention (Baddeley 1976, Baddeley 1986, Glanzer 1972). (a) Accuracy in recalling a short list of stimuli (e.g., trigrams of consonants or of words) decreases dramatically in a few seconds if the subjects' repetition of the memory material (rehearsal) is prevented by a distracting activity such as counting
backwards by threes. (b) In the immediate free recall of a sequence of events, such as words, the final five to six stimuli on the list are recalled better than the preceding ones. This 'recency effect' vanishes after a few seconds of distracting activity, and is minimally affected by factors such as age, rate of presentation of the stimuli, and word frequency. (c) In the immediate serial recall of verbal material (memory span), the subject's performance is affected by factors such as phonological similarity and word length, with the effects of semantic factors being comparatively minor. Each of these phenomena subsequently proved to be considerably more complex than initially thought. They were also interpreted as compatible with a unitary, single-system view of human memory. This account proved untenable, however, mainly on the basis of neuropsychological evidence.

These empirical observations illustrate the main characteristics of 'short-term memory': a retention system with limited capacity, where the memory trace, in the time range of seconds, shows a decay, which may be prevented through rehearsal. Material stored in short-term memory has a specific representational format, which, in the case of the extensively investigated verbal domain, involves phonological codes, separately from lexical-semantic representations stored in long-term memory. The latter contribute, however, to immediate retention, e.g., in verbal span tasks.

The functional architecture of phonological short-term memory has been investigated in detail using effects which break down its storage and rehearsal subcomponents. The effect of phonological similarity, whereby the immediate serial recall of auditory and visual verbal material is poorer for sequences of phonologically similar stimuli than for dissimilar ones, reflects the coding which takes place in the phonological short-term store. The effect of word length, whereby the immediate serial recall of auditory and visual verbal material is poorer for sequences of long words than for short ones, is held to reflect the activity of the process of rehearsal, which is abolished by 'articulatory suppression,' i.e., a continuous uttering of an irrelevant speech sound. Suppression, while disrupting rehearsal, also reduces immediate memory performance in span tasks. The interaction between phonological similarity, input modality, and articulatory suppression, with suppression abolishing the phonological similarity effect only when the stimuli are presented visually, suggests that rehearsal participates in the process of conveying visual verbal material to the phonological short-term store. Finally, some phonological judgments (such as rhyme and stress assignment) are held to involve the articulatory components of rehearsal, because the performance of normal subjects is selectively impaired by suppression (Burani et al. 1991).

In the nonverbal domain a similar distinction is drawn between visual and spatial short- and long-term memory systems, with the relevant representational
format being in terms of the shape or spatial location of the stimulus (Baddeley 1986, Della Sala and Logie 1993). Also in the visuo-spatial domain, similarity, recency, and interference effects have been observed. Visuo-spatial short-term memory is likely to comprise storage and rehearsal (pictorial and spatial) components. In the case of spatial locations, rehearsal may be conceived in terms of planned movements (e.g., ocular, manual reaching, locomotion) towards a target coded in a spatial reference frame (e.g., egocentric).

Figure 1 Short-term forgetting of a single letter by patient PV after 3 to 15 seconds of delay filled by arithmetic distracting activity. With an auditory input, dramatic forgetting occurred within a few seconds. In the visual modality, the patient's performance was at ceiling. Immediate recall was fully accurate in both input modalities, ruling out perceptual and response-related deficits, unlike the short-term memory disorder (Source: data from Basso A, Spinnler H, Vallar G, Zanobio M E 1982 Left hemisphere damage and selective impairment of auditory-verbal short-term memory. Neuropsychologia 20: 263–274)
3.2 Evidence from Brain-injured Patients

Studies in patients with localized brain damage provide unequivocal evidence that supports the independence of short- and long-term memory systems, revealing a double dissociation of deficits. In patients with 'global amnesia,' which may be characterized as a selective impairment of the declarative or explicit component of long-term memory (see Declarative Memory, Neural Basis of; Episodic and Autobiographical Memory: Psychological and Neural Aspects), verbal and visuo-spatial short-term memory
are unimpaired, with immediate serial span, the recency effect in immediate free recall, and short-term forgetting all being within normal limits. This functional dissociation suggests a serial organization of the two systems, with temporary retention being a necessary condition for long-term storage (Atkinson and Shiffrin 1968).

Since the late 1960s selective impairments of short-term memory have been reported (Vallar and Papagno 1995, Vallar and Shallice 1990). The more extensively investigated area concerns auditory-verbal (phonological) short-term memory. Patients show a disproportionate reduction of the auditory-verbal span to an average of less than three items (digits, letters, or words); the recency effect in immediate free recall of auditory-verbal lists of words is abolished; short-term forgetting is abnormally rapid. The disorder is modality-specific, with the level of performance being better when the material is presented visually (Fig. 1). This input-related dissociation has two main implications: (a) discrete phonological and visual short-term memory components exist; (b) in the input–output processing chain, the phonological short-term store should be envisaged as an input system, rather than an output buffer store. This division argues against a monolithic view of the system as a single store which is amodal, i.e., not specific for the different sensory modalities (see Atkinson and Shiffrin 1968). Additional support for this input locus of the system comes from the observation that in some of these patients speech production is entirely preserved.

Patients with defective phonological short-term memory may show unimpaired long-term verbal and visuo-spatial learning (Fig. 2). This observation further corroborates the complete independence of short- and long-term memory systems, but is incompatible with a serial organization, in which defective short-term memory entails a long-term memory impairment (see Atkinson and Shiffrin 1968), suggesting a parallel architecture instead. After early perceptual analysis, information may enter short- or long-term memory, either of which may be selectively disrupted by brain damage. The learning abilities of these patients are, however, dramatically impaired when the phonological material to be learned does not possess pre-existing lexical-semantic representations in long-term memory. This is the case for pronounceable meaningless sequences of letters, such as nonwords, or words of a language unknown to the subject (Fig. 2). This phonological learning deficit indicates that, within a specific representational domain, temporary retention in short-term memory is necessary for stable long-term learning and retention, as predicted by serial models.

Other brain-injured patients show deficits of short-term retention in the visual and spatial domain, even though these patterns of impairment have been explored less extensively than the phonological disorder. Immediate retention of short sequences of spatial
locations, as assessed by a block tapping task (a spatial analogue of digit span), may be selectively defective, with verbal memory being unaffected in both its short- and long-term components. In other patients, the deficit may involve the visual (shape) component of short-term memory, while impairments in the short-term visual recognition of unfamiliar faces, objects, voices, and colours have also been described. The disorder of patients with defective visual imagery (as assessed, for instance, by tasks requiring colour or size comparisons) may be interpreted in terms of an impaired visual short-term memory store (see Visual Imagery, Neural Basis of). A deficit of visuo-spatial short-term memory may also disrupt long-term learning of unfamiliar non-verbal material, as assessed by recognition memory for unfamiliar faces and objects. This extends to the visuo-spatial domain the conclusion that long-term acquisition requires short-term storage.

Figure 2 Word (A) and nonword (B) paired-associate learning by patient PV and matched control subjects (C), with auditory presentation. The patient's performance was within the normal range with native language (Italian) words, dramatically defective with unfamiliar nonwords (Russian words transliterated into Italian) (Source: Baddeley A D, Papagno C, Vallar G 1988 When long-term learning depends on short-term storage. Journal of Memory and Language 27: 586–595)

The process of rehearsal has been traditionally described as an activity which, through rote repetition, refreshes the short-term memory trace, preventing its decay. The precise characteristics of rehearsal have been elucidated in more detail in recent years, particularly in the domain of phonological memory. Rehearsal may be conceived in terms of the recoding of the memory trace from input (auditory-verbal) to output-related (articulatory) representations and
vice-versa. More specifically, rehearsal of verbal material has long been regarded as 'articulatory' in nature, involving output-related verbal processes—such as the motor programming of speech production in a phonological assembly system or output buffer—the actual articulation of the material to be rehearsed (subvocal rehearsal), or both. Anarthric subjects, who are unable to utter any articulated speech sound due to congenital disorders or to brain damage acquired in adult age, may nonetheless show preserved immediate memory, including verbal rehearsal. This suggests that the process is 'central' and does not require the activity of the peripheral musculature.

Brain-damaged patients with a selective impairment of rehearsal (Vallar et al. 1997) show a defective immediate verbal span, as do patients with damage to the phonological short-term store. Both components of phonological memory contribute to immediate retention, though the phonological store provides the major retention capacity. The ability to perform phonological judgments is disrupted, however, by damage to the rehearsal process, but not by damage to the phonological store. By contrast, a defective rehearsal leaves the recency effect in immediate free recall of auditory lists of words largely unimpaired. This recency effect is largely reduced or absent in patients with damage to the phonological short-term store.
4. The Uses of Short-term Memory

Is there a use for a system providing the temporary retention of a limited amount of stimuli, besides infrequent situations such as the following? A friend tells us an unfamiliar eight-digit number, which we have to dial on a telephone placed on the other side of a large street, and we have no paper and pencil to write it down. The answer is yes. Short-term retention contributes to the stable acquisition of new information in long-term memory. More specifically, phonological short-term memory plays an important role in learning new vocabulary and participates in the processes of speech comprehension and production (see Speech Perception; Speech Production, Psychology of).
4.1 Long-term Learning

The observation that patients with a defective auditory-verbal span are also impaired in learning unfamiliar pronounceable letter sequences raises the possibility that phonological memory contributes to a central aspect of language development, the acquisition of vocabulary (Fig. 2). Similarly, subjects with a developmental deficit of phonological memory are impaired in vocabulary acquisition and in non-word learning. An opposite pattern is provided by subjects with a congenital cognitive impairment which selectively spares phonological short-term memory: acquisition of vocabulary, learning of foreign languages, and non-word learning are all preserved. Converging evidence from different subject populations supports this view. Correlational studies in children have shown that the capacity of phonological memory is a main predictor of the subsequent acquisition of vocabulary, both in the native and in a second language. In normal adult subjects, the variables which disrupt immediate memory span (phonological similarity, item length, articulatory suppression) also impair the acquisition of non-words. Polyglots have a greater capacity of phonological memory, compared with non-polyglots, and a better ability to learn novel words. Phonological short-term memory may thus be considered a learning device for the acquisition of novel phonological representations, and for the building up of the phonological lexicon (Baddeley et al. 1998). A few observations in brain-damaged patients suggest a similar role for visuo-spatial short-term memory in the acquisition of new visual information, such as unfamiliar faces and objects.
4.2 Language Processing

The idea that short-term retention contributes to speech comprehension dates back to the 1960s. Phonological memory may withhold incoming auditory-
verbal strings while syntactic and lexical-semantic analyses are performed. Patients with defective phonological memory show preserved comprehension of individual words, as well as of many sentential materials, and a normal ability to decide whether or not sentences are grammatically correct. This may reflect, on the one hand, the operation of on-line lexical-semantic processes, heuristics, and pragmatics, and, on the other, the complete independence of syntactic and lexical-semantic processes from phonological memory. Patients are, however, impaired by 'complex' sentences, where 'complexity' refers to a number of non-mutually-exclusive factors, such as: (a) a high speed of material presentation, which prevents the immediate build-up of an unambiguous cognitive representation; (b) word order conveying meaning-crucial information (e.g., in sentences in which a semantic anomaly is introduced by a change in the linear arrangement of words: 'The world divides the equator into two hemispheres, the southern and the northern'); (c) extralinguistic presuppositions biasing the interpretation of the spoken message. Under such conditions, adequate interpretation may require backtracking to the verbatim (phonological) representation of the sentence, temporarily held in phonological memory. This provides a 'backup' or 'mnemonic window' resource for performing the supplementary cognitive operations necessary for comprehension (Vallar and Shallice 1990).
5. Neural Architecture of Short-term Memory

5.1 Phonological Short-term Memory

Anatomoclinical correlation studies in brain-damaged patients with a selective impairment of the auditory-verbal span indicate that the inferior parietal lobule (supramarginal gyrus) of the left hemisphere, at the temporoparietal junction, represents the main neural correlate of the 'store' component of phonological short-term memory (Vallar and Papagno 1995). The frontal premotor regions in the left hemisphere and other structures such as the insula are the major neural correlates of the 'rehearsal' component, even though the available anatomoclinical data are more limited (Vallar et al. 1997).

Functional neuroimaging studies in normal subjects concur with this pathological evidence in suggesting a left-hemisphere-based network. Activation in the left supramarginal gyrus [Brodmann's area (BA) 40] is associated with the 'store' component of short-term phonological memory; activation in the left frontal premotor BA 44 (Broca's area) and BA 6, and in the left insula, with the 'rehearsal' component (Paulesu et al. 1996). In these studies, in line with the behavioral evidence from normal subjects and patients, an immediate verbal span task activates both the inferior
parietal region (phonological short-term store) and the premotor cortex (rehearsal process) in the left hemisphere. Conversely, rhyme judgements selectively activate the left premotor regions, whose damage, in turn, disrupts the patients' ability to perform this task. These activation and lesion-based data support, from an anatomofunctional perspective, the behavioral distinction between a 'storage' component and a 'rehearsal' process in phonological short-term memory. Furthermore, they qualify 'rehearsal' as a process which makes use of components also concerned with the planning (i.e., programming in the left premotor cortex) of articulated speech. Seen in this perspective, phonological memory may be regarded as a component of the language system. Finally, connectionist modelling of this architecture is currently being developed (Burgess and Hitch 1999).
5.2 Visual and Spatial Short-term Memory

Studies in brain-damaged patients suggest an association between damage to the posterior regions of the right hemisphere and defective spatial short-term memory, as assessed by tasks requiring the reproduction of sequences of spatial locations. There is also evidence suggesting that damage to the posterior regions of the left hemisphere brings about disorders of visual short-term memory, such as defective immediate retention of sequences of visual stimuli (e.g., lines), and impaired recognition of more than one visual stimulus at a time (defective simultaneous form perception) (Vallar and Papagno 1995).

Neuroimaging activation studies in humans support the distinction between primarily spatial (a location, or 'where,' component) and visual short-term memory systems (Smith et al. 1995). The main neural correlates of spatial memory for locations include the occipital extra-striate, posterior–inferior parietal, dorsolateral premotor, and prefrontal cortices. Short-term visual recognition memory is associated with activations in a network including the posterior–inferior parietal, temporal, and ventral premotor and prefrontal cortices. Right hemisphere regions may play a more relevant role in spatial memory for locations, while left hemisphere regions are more involved in visual memory for objects. These patterns of activation indicate an association of the 'dorsal visual stream' with spatial short-term memory for locations and of the 'ventral visual stream' with short-term visual recognition memory.

6. Conclusion

Behavioural observations and neuroanatomical evidence from normal subjects and brain-injured patients concur in suggesting that 'short-term memory' should be conceived as a multiple-component system with specific functional properties and discrete neural correlates. These systems secure the retention of a limited amount of material in the time range of seconds and contribute to relevant aspects of cognition, such as long-term learning.

See also: Declarative Memory, Neural Basis of; Episodic and Autobiographical Memory: Psychological and Neural Aspects; Learning and Memory, Neural Basis of; Memory Retrieval; Memory: Synaptic Mechanisms; Recognition Memory (in Primates), Neural Basis of; Short-term Memory, Cognitive Psychology of; Working Memory, Neural Basis of

Bibliography

Atkinson R C, Shiffrin R M 1968 Human memory: a proposed system and its control processes. In: Spence K W, Taylor Spence J (eds.) The Psychology of Learning and Motivation. Advances in Research and Theory. Academic Press, New York
Baddeley A D 1976 The Psychology of Memory. Basic Books, New York
Baddeley A D 1986 Working Memory. Clarendon Press, Oxford, UK
Baddeley A D, Gathercole S, Papagno C 1998 The phonological loop as a language learning device. Psychological Review 105: 158–73
Burani C, Vallar G, Bottini G 1991 Articulatory coding and phonological judgements on written words and pictures: the role of the phonological output buffer. European Journal of Cognitive Psychology 3: 379–98
Burgess N, Hitch G J 1999 Memory for serial order: A network model of the phonological loop and its timing. Psychological Review 106: 551–81
Della Sala S, Logie R H 1993 When working memory does not work: the role of working memory in neuropsychology. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology. Elsevier, Amsterdam
Glanzer M 1972 Storage mechanisms in recall. In: Bower G H (ed.) The Psychology of Learning and Motivation. Advances in Research and Theory. Academic Press, New York
James W 1890 The Principles of Psychology. Holt, New York
Miller G A 1956 The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review 63: 81–97
Paulesu E, Frith U, Snowling M, Gallagher A, Morton J, Frackowiak R S J, Frith C D 1996 Is developmental dyslexia a disconnection syndrome? Evidence from PET scanning. Brain 119: 143–57
Smith E E, Jonides J, Koeppe R A, Awh E, Schumacher E H, Minoshima S 1995 Spatial versus object working memory: PET investigations. Journal of Cognitive Neuroscience 7: 337–56
Vallar G, Di Betta A M, Silveri M C 1997 The phonological short-term store-rehearsal system: Patterns of impairment and neural correlates. Neuropsychologia 35: 795–812
Vallar G, Papagno C 1995 Neuropsychological impairments of short-term memory. In: Baddeley A D, Wilson B A, Watts F (eds.) Handbook of Memory Disorders. Wiley, Chichester, UK
Vallar G, Shallice T (eds.) 1990 Neuropsychological Impairments of Short-term Memory. Cambridge University Press, Cambridge, UK
G. Vallar
Shyness and Behavioral Inhibition

Since the 1980s, the study of shyness, behavioral inhibition, and social withdrawal in childhood has taken on a research trajectory that can best be described as voluminous. Yet, these related phenomena remain something of a mystery, carrying with them a variety of definitions and a number of very different perspectives concerning psychological significance. Because there appear to be several different characterizations of forms of shy, inhibited, or nonsocial behavior, the first goal of this article is to provide a definitional framework.
1. Shyness, Behavioral Inhibition, and Social Withdrawal: Definitions

In their efforts to identify the etiology of children's personalities and social behaviors, developmental scientists have attempted to determine the relevant dispositional dimensions of temperament that may underlie children's actions that are displayed consistently across situations, and continuously over time. One such dispositional construct is behavioral inhibition (Kagan 1997). Behavioral inhibition has been defined variously as (a) an inborn bias to respond to unfamiliar events by showing anxiety; (b) a specific vulnerability to the uncertainty all children feel when encountering unfamiliar events that cannot be assimilated easily; and (c) one end of a continuum of possible initial behavioral reactions to unfamiliar objects or challenging social situations. These definitions highlight some common elements: behavioral inhibition is (a) a pattern of responding or behaving, (b) possibly biologically determined, such that (c) when unfamiliar and/or challenging situations are encountered, (d) the child shows signs of anxiety, distress, or disorganization.

The term shyness has been used to refer to inhibition in response to novel social situations. In infancy and early childhood, shyness is elicited by feelings of anxiety and distress when confronted by unfamiliar people. To some, such behavior serves an adaptive purpose in that it removes children from situations they perceive as discomforting and dangerous (Buss 1986). But variability of the infantile shyness response is great; some children clearly demonstrate the behavior excessively.

In middle childhood, children have the cognitive skill to compare themselves with others and to
understand that others can, and do, pass judgment on them. It is this understanding of social comparison and evaluation that can elicit shy behavior among older children, adolescents, and adults (Buss 1986). That is, children may remove themselves from situations they believe will be discomforting because others will pass judgment on their skills, personae, and so on. Again, variability is great; some children display shyness that is clearly out of the ordinary.

A related construct is social withdrawal, a phenomenon that refers to the consistent (across situations and over time) display of solitary behavior when encountering familiar and/or unfamiliar peers (Rubin and Stewart 1996). Thus, shyness is embedded within the construct of social withdrawal. The common denominator for inhibition, shyness, and social withdrawal is that the representative behavior is one that moves children away from their social worlds, that is, solitude.
2. Different Forms of Inhibition, Shyness, and Solitude

Inhibited behavior in the face of the unfamiliar typically has been assessed by observing very young children as they are confronted by unfamiliar adults, objects, or peers. Insofar as shyness is concerned, it is best exemplified behaviorally by the display of socially reticent behavior during the preschool and early childhood periods. Among unfamiliar others, reticent preschoolers hang back, watching peers from afar; they make few social initiations and back away from initiations made by others. When asked to speak up in groups of unfamiliar peers, reticent children choose not to do so at all, and if they do speak, it is uncomfortably and for a very short period of time. Further, when asked to collaborate with peers to solve problems, they spend their time watching others rather than providing the requested help.

Reticence is one of several forms of solitude demonstrated by children (Rubin and Asendorpf 1993). Some preschoolers appear to be satisfied playing with objects rather than people. These children play alone, constructing and exploring objects. The term used to describe such behavior is solitary-passivity (Coplan et al. 1994). When others initiate interaction with solitary-passive children, they do not back away; further, they participate fully and actively in group problem-solving behavior. Thus, unlike shy children, who have a high social avoidance motivation, those whose solitary behavior is mainly of the solitary-passive ilk may be described as having a low social approach motivation. Still other children demonstrate solitude that appears to reflect immaturity. These young children engage in solitary-sensorimotor activity (solitary-active play), repeating motor movements with or without objects (Coplan et al. 1994).
2.1 Developmental Differences and Change

Typically, all forms of solitude decrease from early through middle and late childhood. As noted above, the causes of shy behavior change from early wariness to the anxiety associated with being socially evaluated. While reticent behavior tends to decrease with age, the 'meanings' of different forms of solitude change as well. For example, as children come to cope with their social anxieties among unfamiliar peers, they become increasingly likely to display that form of solitude that had been viewed earlier as normal and adaptive—solitary exploration and construction. Thus, with increasing age, such constructive solitude becomes increasingly associated with measures of social wariness and with physiological markers of anxiety and emotion dysregulation (see below).
3. Putative Causes of Inhibition, Shyness, and Social Withdrawal

3.1 Physiology

Behavioral inhibition has been thought to emanate from a physiological 'hard wiring' that evokes caution, wariness, and timidity in unfamiliar social and nonsocial situations. Inhibited infants and toddlers differ from their uninhibited counterparts in ways that imply variability in the threshold of excitability of the amygdala and its projections to the cortex, hypothalamus, sympathetic nervous system, corpus striatum, and central gray. Evidence that a physiological basis underpins behavioral inhibition comes from numerous psychophysiological studies. For example, stable patterns of right frontal EEG asymmetries in infancy predict temperamental fearfulness and behavioral inhibition in early childhood (Calkins et al. 1996). The functional role of hemispheric asymmetries in the regulation of emotion may be understood in terms of an underlying motivational basis for emotional behavior, specifically along the approach–withdrawal continuum. Infants exhibiting greater relative right frontal asymmetry are more likely to withdraw from mild stress, whereas infants exhibiting the opposite pattern of activation are more likely to approach.

Another physiological entity that distinguishes wary from nonwary infants and toddlers is vagal tone, an index of the functional status or efficiency of the nervous system that marks both general reactivity and the ability to regulate one's level of arousal. Reliable associations have been found between vagal tone and inhibition in infants and toddlers: children with lower vagal tone (consistently high heart rate due to less parasympathetic influence) tend to be more behaviorally inhibited (Kagan et al. 1987).
In early childhood, reticent behavior is associated with the same physiological markers as behavioral inhibition in infancy and toddlerhood. Thus, reticent, fearful, solitary behavior is associated with greater relative right frontal EEG activation, but constructive solitude is not (Fox et al. 1996). Further, parents view children who have such EEG asymmetries as anxious. Among older, elementary-school-age children, shy, reticent behavior among familiar peers (i.e., social withdrawal) has been associated positively with internalized negative emotions such as nervousness, distress, and upset, and negatively with positive emotions such as enthusiasm and excitement.
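To make the EEG measure concrete, the sketch below shows one common way such an asymmetry index is computed. It is illustrative only: the ln(right) minus ln(left) alpha-power convention, the 6–9 Hz band sometimes used for infant alpha, the channel names F3/F4, and all parameters are assumptions for the example, not details reported in the studies cited above.

import numpy as np
from scipy.integrate import trapezoid
from scipy.signal import welch

def alpha_power(x, fs, band=(6.0, 9.0)):
    # Power spectral density via Welch's method, integrated over an
    # assumed infant-range alpha band.
    freqs, psd = welch(x, fs=fs, nperseg=int(2 * fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return trapezoid(psd[mask], freqs[mask])

def frontal_asymmetry(left_frontal, right_frontal, fs):
    # ln(right) - ln(left) alpha power. Alpha power varies inversely
    # with cortical activation, so more negative scores indicate greater
    # relative right frontal activation (the withdrawal-related pattern).
    return np.log(alpha_power(right_frontal, fs)) - np.log(
        alpha_power(left_frontal, fs))

# Synthetic one-minute recordings standing in for left (F3) and right
# (F4) frontal channels sampled at 250 Hz.
fs = 250.0
rng = np.random.default_rng(0)
f3 = rng.standard_normal(int(60 * fs))
f4 = rng.standard_normal(int(60 * fs))
print(frontal_asymmetry(f3, f4, fs))

On real recordings, a stably negative score across assessments would correspond to the stable right frontal asymmetry pattern described above.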
3.2 Parent–Child Relationships

Attachment theorists maintain that the primary relationship develops during the first year of life, usually between the mother and the infant (Bowlby 1973). Maternal sensitivity and responsiveness influence whether the relationship will be secure or insecure. Researchers have shown that securely attached infants are likely to be well adjusted, socially competent, and successful at forming peer relationships in early and middle childhood, whereas insecurely attached children may be less successful at social developmental tasks.

Researchers have proposed that infants who are temperamentally reactive and who receive insensitive, unresponsive parenting come to develop an insecure-ambivalent ('C'-type) attachment relationship with their primary caregivers (e.g., Calkins and Fox 1992). In novel settings these 'C' babies maintain close proximity to the attachment figure (usually the mother). When the mother leaves the room briefly, these infants become quite unsettled. Upon reunion with the mother, these infants show angry, resistant behaviors interspersed with proximity- or contact-seeking behaviors. It is argued further that this constellation of infant emotional hyperarousability and insecure attachment may lead the child to display inhibited/wary behaviors as a toddler, and there are data supportive of this conjecture. Given that the social behaviors of preschoolers and toddlers who have an insecure 'C'-type attachment history are thought to be guided largely by fear of rejection, it is unsurprising to find that when these children are observed in peer-group settings, they appear to avoid rejection by demonstrating passive, adult-dependent behavior and withdrawal from social interaction. Lastly, 'C' babies lack confidence and assertiveness at age 4 years; at age 7 years they are seen as passively withdrawn (Renken et al. 1989).
3.3 Parenting Recently, researchers have shown that parental influence and control maintain and exacerbate child
inhibition and social withdrawal. For example, mothers of extremely inhibited toddlers have been observed to display overly solicitous behaviors (i.e., intrusively controlling, unresponsive, physically affectionate). Mothers of shy preschoolers do not encourage independence and exploration. And mothers of socially withdrawn children tend to be highly controlling, overprotective, and poor reciprocators of their child's displays of positive behavior and positive affect. Lastly, researchers have shown that mothers of socially withdrawn children are more likely than mothers of typical children to use such forms of psychological control as devaluation, criticism, and disapproval. Taken together, these parenting practices may attack the child's sense of self-worth (for a review, see Rubin et al. 1995).
4. Correlates and Consequences of Inhibition, Shyness, and Social Withdrawal

Researchers who have followed, longitudinally, the course of development for inhibited infants have found strong consistency of behavior over time. As a group, children identified as extremely inhibited are more likely to be socially wary with unfamiliar peers in both the laboratory and at school, and to exhibit signs of physiological stress during social interactions (Kagan et al. 1987). In a longitudinal study extending into adulthood, Caspi and Silva (1995) found that individuals identified as shy, fearful, and withdrawn at 3 years reported at 18 years that they preferred to stick with safe activities, were cautious and submissive, and had little desire to influence others. A subsequent follow-up at age 21 on interpersonal functioning showed that these same children were normally adjusted in both their work settings and their romantic relationships.

Social withdrawal appears to carry with it the risk of a child's developing negative thoughts and feelings about the self. Highlighting the potential long-term outcomes of social withdrawal is a recent report showing that social withdrawal among familiar peers in school at age 7 years predicted negative self-perceived social competence, low self-worth, loneliness, and felt peer-group insecurity among adolescents aged 14 years (Rubin et al. 1995). These latter findings are augmented by related research. For example, Renshaw and Brown (1993) found that passive withdrawal at ages 9 to 12 years predicted loneliness assessed one year later. Ollendick et al. (1990) reported that 10-year-old socially withdrawn children were more likely to be perceived by peers as withdrawn and anxious, more disliked by peers, and more likely to have dropped out of school than their well-adjusted counterparts five years later. Finally, Morison and Masten (1991) indicated that children perceived by peers as withdrawn and isolated in middle childhood were more likely to think negatively of their social
competencies and relationships in adolescence. In sum, early social withdrawal, and the anxiety associated with it, appears to represent a behavioral marker for psychological and interpersonal maladaptation in childhood and adolescence.
5. Summary and Future Directions

The study of inhibition and shyness garnered an enormous amount of attention in the 1990s. Most empirical research has focused on the contemporaneous and predictive correlates of social reticence, shyness, and withdrawal at different points in childhood and adolescence. These correlated variables include biological, intrapersonal, interpersonal, and psychopathological factors chosen from conceptual frameworks pertaining to the etiology, stability, and outcomes of socially wary and withdrawn behaviors. Thus far, it appears that socially inhibited children have a biological disposition that fosters emotional dysregulation in the company of others. These children, if overly directed and protected by their primary caregiver, become reticent and withdrawn in the peer group. In turn, such behavior precludes the development of social skills and the initiation and maintenance of positive peer relationships. Finally, this transactional experience seems to lead children to develop anxiety, loneliness, and negative self-perceptions of their relationships and social skills.

Despite these strong conclusions, however, it is important to recognize that the data bases upon which these conclusions rest are relatively few. Clearly, replication work is necessary. The extent to which dispositional factors interact with parenting styles and parent–child relationships to predict the consistent display of socially withdrawn behavior in familiar peer contexts still needs to be established. Further, possible sex differences in these phenomena require additional attention. Lastly, our knowledge of the developmental course of inhibition, shyness, and social withdrawal is constrained by the almost sole reliance on data gathered in Western cultures. Little is known about the developmental course of these phenomena in Eastern cultures such as those of China, Japan, or India; and even less is known about Southern cultures such as those found in South America, Africa, and southern Europe. It may well be that, depending on the culture within which these phenomena are studied, the biological, interpersonal, and intrapersonal causes, concomitants, and consequences of inhibition, shyness, and social withdrawal vary. In short, cross-cultural research is necessary, not only for the study of these phenomena, but also for most behaviors that are viewed as deviant or reflective of intrapsychic abnormalities in the West.
See also: Attachment Theory: Psychological; Emotional Inhibition and Health; Personality and Social Behavior; Personality Development and Temperament; Temperament and Human Development
Bibliography

Bowlby J 1973 Attachment and Loss, Vol. 1: Attachment. Basic Books, New York

Buss A H 1986 A theory of shyness. In: Jones W H, Cheek J M, Briggs S R (eds.) Shyness: Perspectives on Research and Treatment. Plenum, New York

Calkins S D, Fox N A 1992 The relations among infant temperament, security of attachment and behavioral inhibition at 24 months. Child Development 63: 1456–72

Calkins S D, Fox N A, Marshall T R 1996 Behavioral and physiological antecedents of inhibited and uninhibited behavior. Child Development 67: 523–40

Caspi A, Silva P A 1995 Temperamental qualities at age three predict personality traits in young adulthood: Longitudinal evidence from a birth cohort. Child Development 66: 486–98

Coplan R J, Rubin K H, Fox N A, Calkins S D, Stewart S L 1994 Being alone, playing alone, and acting alone: Distinguishing among reticence, and passive- and active-solitude in young children. Child Development 65: 129–37

Fox N A, Schmidt L A, Calkins S D, Rubin K H, Coplan R J 1996 The role of frontal activation in the regulation and dysregulation of social behavior during the preschool years. Development and Psychopathology 8: 89–102

Kagan J 1997 Temperament and the reactions to unfamiliarity. Child Development 68: 139–43

Kagan J, Reznick J S, Snidman N 1987 The physiology and psychology of behavioral inhibition in children. Child Development 58: 1459–73

Morison P, Masten A S 1991 Peer reputation in middle childhood as a predictor of adaptation in adolescence: A seven-year follow-up. Child Development 62: 991–1007

Ollendick T H, Greene R W, Weist M D, Oswald D P 1990 The predictive validity of teacher nominations: A five-year follow-up of at-risk youth. Journal of Abnormal Child Psychology 18: 699–713

Renken B, Egeland B, Marvinney D, Mangelsdorf S, Sroufe L 1989 Early childhood antecedents of aggression and passive-withdrawal in early elementary school. Journal of Personality 57: 257–81

Renshaw P D, Brown P J 1993 Loneliness in middle childhood: Concurrent and longitudinal predictors. Child Development 64: 1271–84

Rubin K H, Asendorpf J 1993 Social withdrawal, inhibition and shyness in childhood: Conceptual and definitional issues. In: Rubin K H, Asendorpf J (eds.) Social Withdrawal, Inhibition, and Shyness in Childhood. L. Erlbaum Associates, Hillsdale, NJ

Rubin K H, Chen X, McDougall P, Bowker A, McKinnon J 1995 The Waterloo Longitudinal Project: Predicting internalizing and externalizing problems in adolescence. Development and Psychopathology 7: 751–64

Rubin K H, Stewart S, Chen X 1995 Parents of aggressive and withdrawn children. In: Bornstein M H (ed.) Handbook of Parenting: Children and Parenting. L. Erlbaum Associates, Mahwah, NJ, Vol. 1, pp. 225–84

Rubin K H, Stewart S L 1996 Social withdrawal. In: Mash E J, Barkley R A (eds.) Child Psychopathology. Guilford, New York, pp. 277–307

K. H. Rubin
Sibling-order Effects Historically, sibling order has influenced important aspects of social, economic, and political life, and it continues to do so today in many traditional societies. Discriminatory inheritance laws and customs about royal succession that favor firstborns and eldest sons are just two examples. Sibling-order effects have also been documented for a wide variety of behavioral tendencies, although the magnitude and causal interpretation of these effects have been subject to debate. In the Darwinian process of competing for parental favor, siblings often employ strategies that are shaped by their order of birth within the family, and these strategies exert a lasting impact on personality. Radical revolutions are particularly likely to elicit differences in support by birth order. These behavioral patterns appear to be mediated by differences in personality as well as by differing degrees of identification with the family.
1. Social, Economic, and Political Consequences Many societies—especially in past centuries and in non-Western parts of the world—have engaged in practices that favor one sibling position over another. For example, most traditional societies permit infanticide, especially when a child has a birth defect or when an immediately older infant is still breastfeeding. However, no society condones the killing of the elder of two siblings (Daly and Wilson 1988). An extensive survey of birth order and its relationship to social behavior in 39 non-Western societies found that the birth of a first child typically increases the status of parents and stabilizes the marriage. In addition, firstborns were generally found to receive more elaborate birth ceremonies, were given special privileges, and tended to exert authority over their younger siblings. In most of these 39 societies, firstborns maintained supremacy over their younger siblings even in adulthood. They also gained control of a greater share of parental property than laterborns, were more likely to become head of their local kin group, and tended to exert greater power in relationships with nonfamily members (Rosenblatt and Skoogberg 1974).
Previously, many Western societies employed sibling order as a means of deciding who inherits parental property or assumes political power. Primogeniture (the policy by which the eldest child or son automatically inherits property or political authority) was the most commonly employed mechanism, but other discriminatory inheritance practices have also been employed. For example, secondogeniture involves leaving the bulk of parental property to the second child or second son, and ultimogeniture involves leaving such property to the lastborn or youngest son. Most variations in inheritance practices by sibling order can be understood by considering the advantages that accrue from such policies, given local economic circumstances (Hrdy and Judge 1993). For example, primogeniture has generally been practiced in societies where wealth is stable and based on limited land, and where talent does not matter much. Leaving the bulk of parental property to a single offspring avoids subdividing estates and hence reducing the social status of the family patronymic—something that was particularly important in past centuries among the landed aristocracy. However, in Renaissance Venice economic fortunes were generally based on speculative commerce rather than ownership of property, and parents typically subdivided their estates equally so as to maximize the chances of having one or more commercially successful offspring (Herlihy 1977). Ultimogeniture is a policy often found in societies that impose high death taxes on property. This inheritance practice has the consequence of maximizing the interval between episodes of taxation, thus reducing the overall tax burden.

Even in societies employing inheritance systems that favor one sibling over others, parents have commonly provided more or less equally for their offspring by requiring the child who inherits the family estate to pay a compensatory amount to each sibling. Primogeniture and related practices did not always mean disinheritance—a common misassumption. In medieval and early modern times, however, primogeniture among the landed aristocracy did mean that some younger sons and daughters faced difficult economic and social prospects. Among the medieval Portuguese nobility, for example, landless younger sons and daughters were significantly less likely to marry and leave surviving offspring (Boone 1986). Younger sons left 1.6 fewer children than did eldest sons, and were also nine times more likely than eldest sons to father a child out of wedlock. Because they posed a serious threat to political stability in their own country, younger sons were channeled by the state into expansionist military campaigns in faraway places such as India, where they often died in battle or from disease. Similarly, the Crusades can be seen, in part, as a church-state response to this domestic danger (Duby 1977). The surplus of landless younger daughters in the titled nobility was dealt with by sending them to nunneries.
2. Personality

Birth-order differences have long been claimed in the domain of personality, although these claims have remained controversial despite considerable research on the topic. Psychologists have investigated the consequences of birth order ever since Charles Darwin's cousin Francis Galton (1874) reported that eldest sons were overrepresented as members of the Royal Society. After breaking away from Sigmund Freud in 1910 to found a variant school of psychoanalysis, Alfred Adler (1927) focused on birth order in his own attempt to emphasize the importance of social factors in personality development. A secondborn, Adler considered firstborns to be 'power-hungry conservatives.' He characterized middleborns as competitive and lastborns as spoiled and lazy.

Since Adler speculated about birth order and its consequences for personality in 1927, psychologists have conducted more than 2,000 studies on the subject. Critics of this extensive literature have argued that most of these studies are inadequately controlled for key covariates, such as social class and sibship size; that studies often conflict; and that birth-order differences in personality and IQ seem to have been overrated (Ernst and Angst 1983). Meta-analysis—a technique for aggregating findings in order to increase statistical power and reliability—suggests a different conclusion. If we consider only those well-designed studies controlling for important background variables that covary with birth order and can introduce spurious cross-correlations, a meta-analytic review reveals consistent birth-order differences for a wide range of personality traits (Sulloway 1995). For instance, firstborns are generally found to be more conscientious than laterborns, a difference exemplified by their being more responsible, ambitious, and self-disciplined. In addition, firstborns tend to be more conforming to authority and respectful of parents. Firstborns also tend to have higher IQs than their younger siblings—a reduction of about one IQ point is observed, on average, with each increment in birth rank (Zajonc and Mullally 1997).

These and other birth-order differences in personality can be usefully understood from a Darwinian perspective on family life (Sulloway 1996). Because they share, on average, only half of their genes, siblings will tend to compete unless the benefits of cooperating are greater than twice the costs (Hamilton 1964). In this Darwinian story about sibling competition, birth order does not exert a direct biological influence on personality. For example, there are no genes for being a firstborn or a laterborn. Instead, birth order is best seen as a proxy for various environmental influences, particularly disparities in age, physical size, power, and status within the family. These physical and mental differences lead siblings to pursue alternative strategies in their efforts to maximize parental investment (which includes emotional as well as physical resources).
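The cooperation threshold in the preceding paragraph is Hamilton's rule applied to full siblings; the algebra below is a standard gloss, not notation from the article. Altruism toward kin is favored when the relatedness-weighted benefit exceeds the cost:

\[
rB > C, \qquad r_{\text{full siblings}} = \tfrac{1}{2} \;\Longrightarrow\; B > 2C,
\]

where $r$ is the coefficient of relatedness, $B$ the benefit to the recipient, and $C$ the cost to the actor. This is why the benefits of cooperating must exceed twice the costs before sibling cooperation pays.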
This perspective on how sibling order shapes personality accords with research in behavioral genetics, which finds that most environmental influences on personality are not shared by siblings and hence belong to the 'nonshared environment' (Plomin and Daniels 1987).

Prior to the twentieth century, half of all children did not survive childhood. Even minor differences in parental favor would have increased an offspring's chances of getting out of childhood alive. Because eldest children have already survived the perilous early years of childhood, they are more likely than younger siblings to reach the age of reproduction and to pass on their parents' genes. Quite simply, a surviving eldest child was generally a better Darwinian bet than a younger child, which is why parents in most traditional societies tend to bias investment, consciously or unconsciously, toward older offspring. An exception to this generalization involves youngest children born toward the end of a woman's reproductive career. These children also tend to be favored, since they cannot be replaced (Salmon and Daly 1998).

Even when parents do not favor one offspring over another, siblings compete to prevent favoritism. Siblings do so, in part, by cultivating family niches that correspond closely with differences in birth order. Firstborns, for example, tend to act as surrogate parents toward their younger siblings, which is a good way of currying parental favor. As a consequence, firstborns tend to be more conservative and parent-identified than their younger siblings, which is one of the most robustly documented findings in the literature. Laterborns cannot compete as effectively as firstborns for this surrogate-parent niche, since they cannot baby-sit themselves. As a consequence, laterborns seek alternative family niches that will help them to cultivate parental favor in different ways. To do so, they must often look within themselves for latent talents that can only be discovered through systematic experimentation. Toward this end, laterborns tend to be more open to experience—that is, more imaginative, prone to fantasies, and unconventional—propensities that offer an increased prospect of finding a valued and unoccupied family niche.

In addition to their efforts to cultivate alternative family niches, siblings diverge in personality and interests because they employ differing strategies in dealing with one another. These strategies are similar to those observed in mammalian dominance hierarchies. Because firstborns are physically bigger than their younger siblings, they are more likely to employ physical aggression and intimidation in dealing with rivals. Firstborns are the 'alpha males' of their sibling group, and they generally boss and dominate their younger brothers and sisters. For their own part, laterborns tend to employ low-power strategies to obtain what they want, including pleading, whining, cajoling, humor, social intelligence, and, whenever expedient, appealing to parents for assistance.
Laterborns also tend to form coalitions with one another in an effort to circumvent the physical advantages enjoyed by the firstborn. Middle children are the most inclined to employ diplomatic and cooperative strategies. Some middle children are particularly adept at nonviolent methods of protest. Martin Luther King, Jr., the middle of three children, began his career as a champion of nonviolent reform by interceding in the episodes of merciless teasing that his younger brother inflicted upon their elder sister.

Only children represent a controlled experiment in birth-order research. They have no siblings and therefore experience no sibling rivalry. As a consequence, they are not driven to occupy a particular family niche. Although only children, like other firstborns, are generally ambitious and conform to parental authority, they are intermediate between firstborns and laterborns on most other personality traits. Because age spacing can affect functional sibling order, a firstborn whose next younger sibling is six or more years younger is effectively like an only child. Similarly, large age gaps can make some laterborns into functional firstborns or only children.

That brothers and sisters differ from one another for strategic reasons, rather than randomly, has been shown by studies involving more than one sibling from the same family (Schachter 1982). In one study, firstborns were found to be significantly different in personality and interests from secondborns, who were significantly different from thirdborns. By contrast, the first and third siblings were not as different as adjacent pairs, presumably because of less competition. This process of sibling differentiation has been termed 'deidentification' and extends to relationships with parents. For example, a firstborn child sometimes identifies more strongly with one parent than another. In such cases, the secondborn tends to identify more strongly with the parent that is not already preferred by the firstborn.

The most compelling evidence for birth-order effects in personality comes from studies in which siblings assess each other's personalities (Paulhus et al. 1999, Sulloway 1999). Such within-family designs control for the kinds of spurious correlations that can result from comparing individuals from different family backgrounds. Studies based on such direct sibling comparisons exhibit consistent birth-order effects in the expected direction. When the results are organized according to the Five-Factor Model of personality (Costa and McCrae 1992), firstborns tend to be more conscientious and slightly more neurotic than laterborns, whereas laterborns tend to be more agreeable, extraverted, and open to experience than firstborns. Although reasonably consistent patterns for birth order and personality are observed according to the Five-Factor Model, findings are significantly heterogeneous for three of the five personality dimensions. For example, firstborns tend to be more assertive than laterborns, which is an
indicator of extraversion, but laterborns tend to score higher on most other facets of extraversion, which include being more fun-loving, sociable, and excitement-seeking. Similarly, laterborns tend to be more open to experience in the sense of being unconventional, whereas firstborns tend to be more open to experience in the sense of being intellectually oriented. Lastly, firstborns are more neurotic in the sense of being anxious about status, but laterborns are more neurotic in the sense of being more self-conscious.

As measured by direct sibling comparisons within the family, birth-order differences explain about four percent of the variance in personality, less than does gender and substantially more than do age, family size, or social class. In studies controlled for differences in age, sex, social class, and sibship size, siblings are about twice as likely to exhibit traits that are consistent with their sibling positions as to exhibit inconsistent traits. In short, not every laterborn has a laterborn personality (just as some firstborns deviate from the expected trend), but a reasonably consonant pattern is nevertheless present.

One still-unresolved question about birth order and personality is the extent to which within-family patterns of behavior transfer to behavior outside the family of origin. Recent studies suggest that birth-order effects observed in extrafamilial contexts are about one-third to one-half the magnitude of those manifested within the family (Sulloway 1999). Relative to firstborn spouses and roommates, for example, laterborn spouses and roommates are generally perceived to be more agreeable and extraverted, but less conscientious and neurotic. Outside the family of origin, birth-order effects seem to manifest themselves most strongly in intimate living relationships and in dominance hierarchies. These findings are not surprising, since they involve the kind of behavioral contexts in which sibling strategies were originally learned.
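For readers who think in correlational terms, the four percent figure can be restated via the standard conversion between variance explained and the correlation coefficient (a gloss added here, not a computation from the article):

\[
r = \sqrt{R^{2}} = \sqrt{0.04} = 0.20,
\]

a modest effect by conventional benchmarks, yet, as the text notes, larger than the corresponding effects of age, family size, or social class.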
3. Behavior During Radical Revolutions

Some of the best evidence for sibling differences in behavior outside the family of origin comes from social and intellectual history. During socially radical revolutions, laterborns have generally been more likely than firstborns to question the status quo and to adopt a revolutionary alternative. This difference is observed even after controlling for the fact that laterborns, in past centuries at least, appear to have been more socially liberal than firstborns. As a rule, these effects are very context sensitive. During the Protestant Reformation, for example, laterborns were nine times more likely than firstborns to suffer martyrdom in support of the Reformed faith. (This statistic is corrected for the greater number of laterborns in the general population.) In countries that turned Protestant, such as England, firstborns were five times more likely than laterborns to suffer martyrdom for their refusal to abandon their allegiance to the Catholic faith. Thus laterborns gave their lives in the service of radical rebellion, whereas firstborns did so in an effort to preserve the waning orthodoxy. There was no generalized tendency, however, for laterborns to become 'martyrs' (Sulloway 1996).
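The parenthetical correction can be made explicit. The counts below are hypothetical, chosen only to illustrate the arithmetic of adjusting for unequal base rates:

\[
\text{corrected ratio} \;=\; \frac{m_{L}/m_{F}}{p_{L}/p_{F}},
\]

where $m_{L}$ and $m_{F}$ are the numbers of laterborn and firstborn martyrs, and $p_{L}$ and $p_{F}$ their shares of the contemporary population. With, say, 90 laterborn and 5 firstborn martyrs drawn from a population that is two-thirds laterborn, the raw ratio of 18 shrinks to $18/2 = 9$ once the twofold excess of laterborns in the population is factored out.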
Similarly, during the French Revolution firstborn deputies to the National Convention generally voted according to the expectations of their social class, whereas laterborn deputies were more likely to vote in the opposite manner. For this reason, there was only a modest tendency for firstborns and laterborns to vote in a particular manner, since voting was generally influenced by other important considerations that determined what it meant to conform or rebel in a rapidly changing political environment.

In response to radical conceptual changes in science one finds similar differences by birth order. Firstborns tend to be the guardians of what Thomas Kuhn (1970) has called 'normal' science, during which research is guided by the prevailing paradigms of the day. By contrast, laterborns tend to be the outlaws of science, sometimes flouting its accepted methods and assumptions in the process of attempting radical conceptual change. During the early years of the Copernican revolution, which challenged church doctrine by claiming that the earth revolves around the sun, laterborns were five times more likely than firstborns to endorse this heretical theory. Copernicus himself was the youngest of four children. When Charles Darwin, the fifth of six children, became an evolutionist in the 1830s, he was 10 times more likely to do so than a firstborn. During other notable revolutions in science—including those led by laterborns such as Bacon, Descartes, Hutton, Semmelweis, and Heisenberg, and by occasional firstborns such as Newton and Einstein—younger siblings have typically been twice as likely as firstborns to endorse the new and radical viewpoint. Conversely, when new scientific doctrines, such as vitalism and eugenics, have appealed strongly to social conservatives, firstborns have been more likely than laterborns to pioneer and endorse such novel but ideologically conservative ideas. These birth-order effects typically fade over the course of scientific progress, as conceptual innovations accumulate supporting evidence and eventually convince the scientific majority.

In spite of their predilection for supporting radical innovations, laterborns have no monopoly on scientific truth. They were nine times more likely than firstborns to support Franz Joseph Gall's bogus theory of phrenology, which held that character and personality could be read by means of bumps on the head. In their willingness to question the status quo, laterborns run the risk of error through over-eager rebellion, just as firstborns sometimes err by resisting valid conceptual changes until the mounting evidence can no longer be denied.
During 'normal' science, firstborns possess a slight advantage over laterborns. Being more academically successful, firstborns are more likely than laterborns to become scientists. They also tend to win more Nobel prizes. This finding might seem surprising, but, according to the terms of Nobel's will, Nobel prizes have generally been given for 'discoveries' or creative puzzle solving in science, not for radical conceptual revolutions. Firstborn scientists who have innovated within the system include James Watson and Francis Crick, who together unraveled the structure of DNA, and Jonas Salk, who developed the polio vaccine. Such historical facts highlight another important point, namely, that firstborns and laterborns do not differ in overall levels of 'creativity.' Rather, firstborns and laterborns are preadapted to solving dissimilar kinds of problems by employing disparate kinds of creative strategies.
4. Family Sentiments

The kinds of birth-order effects that are observed during radical historical revolutions may depend as much on differences in 'family sentiments' as they do on personality. As Salmon and Daly (1998) have shown, firstborns (and to a lesser extent lastborns) are more strongly attached to the family system than are middle children. Historically, radical revolutions have tapped differences in family sentiments in two important ways. First, being considerably older than their offspring, parents were more likely to endorse the status quo, given that age is one of the best predictors of the acceptance of new and radical ideas (Hull et al. 1978). For this reason, endorsing a radical revolution has usually meant opposing parental values and authority, something that firstborns (and to a lesser extent lastborns) are less likely to do than are middleborns. Second, most radical revolutions, even in fields such as science, have tended to raise issues that are directly relevant to questions of parental investment and discrimination among offspring, filial loyalty, and overall identification with the family system. During the Reformation, for example, Protestant leaders such as Martin Luther strongly advocated the right to marriage by previously celibate clergymen and nuns, who tended to be younger sons and daughters. These Protestant leaders also advocated the egalitarian principles of partible inheritance, by which property—and even political rule—were subdivided equally among offspring (Fichtner 1989). During the height of the controversies raging over Darwin's theory of evolution, Darwin noted that discriminatory inheritance practices posed a serious impediment to human evolution, remarking to Alfred Russel Wallace: 'But oh, what a scheme is primogeniture for destroying natural selection!' (Sulloway 1996, p. 54).

Even when radical ideology does not play a central role in conceptual change, and when discriminatory
inheritance systems are also not a factor, differences in identification with the family can impact on social and political attitudes. As Salmon (1998) has shown experimentally, political speeches appeal differentially to individuals by birth order depending on whether kinship terms are included in the speech. Firstborns and lastborns are more likely to react positively to speeches that employ kinship terms such as ‘brother’ and ‘sister,’ whereas middle children prefer speeches that employ references to ‘friends.’ To the extent that radical social movements make use of kinship terms, which they often do, birth-order differences in family sentiments will tend to influence how siblings react to radical change.
5. Conclusion In past centuries birth order and functional sibling order have influenced such diverse social and political phenomena as royal succession, expansionist military campaigns, religious wars and crusades, geographic exploration, and inheritance laws. Even today, birth order and family niches more generally are among the environmental sources of personality because they cause siblings to experience the family environment in dissimilar ways. In particular, birth order introduces the need for differing strategies in dealing with sibling rivals as part of the universal quest for parental favor. This is a Darwinian story, albeit one with a marked environmental twist. Although siblings appear to be hard-wired to compete for parental favor, the specific niche in which they have grown up determines the particular strategies they adopt within their own family. Finally, because birth order and family niches underlie differences in family sentiments, including filial loyalty and self-conceptions about family identity, the family has historically supplied a powerful engine for revolutionary change. See also: Family, Anthropology of; Family as Institution; Family Processes; Family Size Preferences; Family Systems and the Preferred Sex of Children; Galton, Sir Francis (1822–1911); Personality Development in Childhood; Property: Legal Aspects of Intergenerational Transmission; Sibling Relationships, Psychology of
Bibliography Adler A 1927 Understanding Human Nature. Greenberg, New York Boone J L 1986 Parental investment and elite family structure in preindustrial states: case study of late medieval-early modern Portuguese genealogies. American Anthropologist 88: 859–78 Costa P T, McCrae R R 1992 NEO PI-R Professional Manual. Psychological Assessment Resources, Odessa, FL Daly M, Wilson M 1988 Homicide. Aldine de Gruyter, Hawthorne, New York
Duby G 1977 The Chivalrous Society. Edward Arnold, London
Ernst C, Angst J 1983 Birth Order: Its Influence on Personality. Springer-Verlag, Berlin
Fichtner P S 1989 Protestantism and Primogeniture in Early Modern Germany. Yale University Press, New Haven, CT
Galton F 1874 English Men of Science. Macmillan, London
Hamilton W 1964 The genetical evolution of social behavior. Parts I and II. Journal of Theoretical Biology 7: 1–52
Herlihy D 1977 Family and property in Renaissance Florence. In: Herlihy D, Udovitch A L (eds.) The Medieval City. Yale University Press, New Haven, CT, pp. 3–24
Hrdy S, Judge D S 1993 Darwin and the puzzle of primogeniture. Human Nature 4: 1–45
Hull D L, Tessner P D, Diamond A M 1978 Planck's principle. Science 202: 717–23
Kuhn T S 1970 The Structure of Scientific Revolutions, 2nd edn. University of Chicago Press, Chicago
Paulhus D L, Chen D, Trapnell P D 1999 Birth order and personality within families. Psychological Science 10: 482–8
Plomin R, Daniels D 1987 Why are children in the same family so different from one another? Behavioral and Brain Sciences 10: 1–60
Rosenblatt P C, Skoogberg E L 1974 Birth order in cross-cultural perspective. Developmental Psychology 10: 48–54
Salmon C 1998 The evocative nature of kin terminology in political rhetoric. Politics and the Life Sciences 17: 51–7
Salmon C A, Daly M 1998 Birth order and familial sentiment: Middleborns are different. Evolution and Human Behavior 19: 299–312
Schachter F F 1982 Sibling deidentification and split-parent identifications: A family tetrad. In: Lamb M E, Sutton-Smith B (eds.) Sibling Relationships: Their Nature and Significance across the Lifespan. Lawrence Erlbaum, Hillsdale, NJ, pp. 123–52
Sulloway F J 1995 Birth order and evolutionary psychology: A meta-analytic overview. Psychological Inquiry 6: 75–80
Sulloway F J 1996 Born to Rebel: Birth Order, Family Dynamics, and Creative Lives. Pantheon, New York
Sulloway F J 1999 Birth order. In: Runco M A, Pritzker S (eds.) Encyclopedia of Creativity 1: 189–202
Zajonc R B, Mullally P R 1997 Birth order: Reconciling conflicting effects. American Psychologist 52: 685–99
F. J. Sulloway
Sibling Relationships, Psychology of

Siblings—brothers and sisters—have a key place in legends, history, and literature throughout the world, from the era of the Egyptians and Greeks onward. The great majority of children (around 80 percent in Europe and the USA) grow up with siblings, and for most individuals their relationships with their siblings are the longest-lasting in their lives. Scientific study of the psychology of siblings is relatively recent, but is fast-growing, centering chiefly on studies of childhood and adolescence. The scientific interest in siblings lies, in particular, in the following domains: the nature and the potential influence of siblings on each other's development and adjustment, the illuminating
perspective the study of siblings provides on developmental issues, and the challenge that siblings present to our understanding of how families influence development—why siblings differ notably in personality and adjustment even though they grow up within the same family, the significance of their shared and separate family experiences, and their genetic relatedness.
1. The Term ‘Siblings’ This term is usually applied to ‘full’ siblings, brothers and sisters who are the offspring of the same mother and father, and share 50 percent of their genes; however, with the changes in family structure during the later decades of the twentieth century, increasing numbers of children have relationships with ‘half siblings,’ children with whom they share one biological parent, and ‘step-siblings,’ children who are unrelated biologically.
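These three categories differ in expected genetic overlap. Stated in terms of the coefficient of relatedness (a standard formalization added here; the article itself gives only the 50 percent figure for full siblings):

\[
r_{\text{full}} = \tfrac{1}{2}, \qquad r_{\text{half}} = \tfrac{1}{4}, \qquad r_{\text{step}} \approx 0,
\]

that is, half siblings share one-quarter of their genes by descent, on average, and step-siblings essentially none.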
2. The Nature of Sibling Relationships

2.1 Characteristics of the Relationship

The relationship between siblings is one that is characterized by distinctive emotional power and intimacy from infancy onward. It is a relationship that offers children unique opportunities for learning about self and other, with considerable potential for affecting children's well-being, intimately linked as it is with each child's relationship with the parents. Clinicians and family systems theorists have, from early in the twentieth century, expressed interest in the part siblings play in family relationships, and in the adjustment of individuals. However, until the 1980s there was relatively little systematic research on siblings (with the notable exception of the classic studies of birth order by Koch 1954). In the 1980s and 1990s, research interest broadened greatly to include the investigation of sibling developmental influences, sources of individual differences between siblings, and links between family relationships (Boer and Dunn 1990, Brody 1996). Individual differences in how siblings get along with each other are very marked from early infancy; siblings' feelings range from extreme hostility and rivalry to affection and support, are often ambivalent, and are expressed uninhibitedly. The relationship is also notable for its intimacy: siblings know each other very well, and this can be a source of both support and conflict. These characteristics increase the potential of the relationship for developmental influence. Because of the emotional intensity and familiarity of their relationships, the study of siblings provides an illuminating window on what children understand about each other, which has challenged and informed
conceptions of the development of early social understanding (Dunn 1992).

2.2 Developmental Changes in Sibling Relationships

As children's powers of understanding and communication develop, their sibling relationships change. Younger siblings play an increasingly active role in the relationship during the preschool years, and their older siblings take increasing interest in them. During middle childhood their relationships become more egalitarian; there is disagreement about how far this reflects an increase in the power of younger siblings over older siblings, or a decrease in the dominance both older and younger try to exert. A decrease in warmth between siblings during adolescence parallels the patterns of change found in the parent–child relationship as adolescents become increasingly involved with peers outside the family (Boer and Dunn 1990). There is a paucity of studies of siblings in adulthood; however, what information we have indicates that in the USA, most siblings maintain contact, communicate, and share experiences until very late in life (Cicirelli 1996). During middle age, most adults describe feelings of closeness, rather than rivalry, with their siblings, even when they have reduced contact, though family crises can evoke sibling conflict. Closeness and companionship become increasingly evident among older adults. Step- and half-siblings also continue to keep contact with each other, though they see each other less often than full siblings. Relationships with sisters appear to be particularly important in old age; this is generally attributed to women's emotional expressiveness and their traditional roles as nurturers. Little systematic research has focused on ethnic differences; however, in a national sample in the USA that compared sibling relationships among African-American, Hispanic, non-Hispanic white, and Asian-American adult respondents, the conclusion was that the similarities across groups in contact and social support far outweighed the differences (Riedmann and White 1996). Siblings play a central role in adults' lives in many other cultures; the psychology of these relationships remains to be studied systematically.

2.3 Continuities in Individual Differences in Relationships

Do the marked individual differences in the affection or hostility that siblings show toward each other in early childhood show stability through middle childhood and adolescence? Research described in Brody (1996) indicates there is some continuity over time for both the positive and negative aspects of sibling relations, but there is also evidence for change. Contributors to change included new friendships children formed during the school years, which led to
loss of warmth in the sibling relationship, or increased jealousy; developmental changes in the individuals; and life events (the majority of which contributed to increased intimacy and warmth, an exception being divorce or separation of parents). Research on adult siblings shows the relationship is not a static one during adulthood: life events, employment changes, etc. affect adult siblings’ relations—some increasing intimacy and contact, others with negative effects (Cicirelli 1996).
2.4 Influences on Individual Differences

The temperament of both individuals in a sibling dyad is important in relation to conflict between them, and so is the match between their temperaments (Brody 1996). The effects of gender and age gap are less clear: findings are inconsistent for young siblings, while research on children in middle childhood indicates that these 'family constellation' effects influence the relationship in complex ways; gender and socioeconomic status apparently increase in importance as influences on the sibling relationship during adolescence. Most striking are the links between the quality of sibling relationships and other relationships within the family—the parents' relationships with each child, and the mother–partner or marital relationship. These connections are considered next.
3. Siblings, Parents, and Peers: Connections Between Relationships To what extent and in what ways do parent–child relationships influence sibling relationships? Currently there is much debate and some inconsistency in the research findings. First, there is some evidence that the security of children’s attachment to their parents is correlated with later sibling relationship quality, and that positive parent–child relations are correlated with positive sibling relations. Note that conclusions cannot be drawn about the direction of causal influence from these correlational data. There are also data that fit with a ‘compensatory’ model, in which intense supportive sibling relationships develop in families in which parents are uninvolved. This pattern may be characteristic of families at the extremes of stress and relationship difficulties. More consistent is the evidence that differential parent–child relationships (in which more attention and affection, and less punishment is shown by a parent toward one sibling than another) are associated with more conflicted, hostile sibling relationships (Hetherington et al. 1994). These links are especially clear for families under stress, such as those who have experienced divorce, and in those with disabled or sick siblings. Again, note that the evidence is correlational.
Indirect links between parent–child and sibling relationships have been documented in studies of the arrival of a sibling; a number of different processes are implicated, ranging from general emotional disturbance to processes of increasing cognitive complexity—such as increased positivity between siblings in families in which mothers talked to the firstborn about the feelings and needs of the baby in the early months. This evidence indicates that even with young siblings, processes of attribution and reflection may be implicated in the quality of their developing relationship. The quality of the marital relationship is also linked to sibling relationships: mother–partner hostility is associated with increased negativity between siblings, while affection between adult partners is associated with positivity between siblings. Both direct pathways between marital and sibling relationships and indirect pathways (via parent–child relationships) are implicated (Dunn et al. 1999).
4. Developmental Influence

4.1 Influence on Adjustment

Three adjustment outcome areas in which siblings are likely to exert influence are aggressive and externalizing behavior, internalizing problems, and self-esteem. For example, hostile, aggressive sibling behavior is correlated with increasing aggressive behavior by the other sibling. Patterson and his group have shown the shaping role that siblings play in this pattern (Patterson 1986), for both clinic and community samples. The arrival of a sibling is consistently found to be linked to increased problems: disturbance in bodily functions, withdrawal, aggressiveness, dependency, and anxiety. It is thought these changes are linked to parallel changes in interaction between the 'displaced' older siblings and the parents. A growing literature links sibling influence to deviant behavior in adolescence. For example, frequent and problematic drinking by siblings increases adolescents' tendency to drink: siblings appear to have both a direct effect and an indirect later effect on other siblings' risks of becoming a drinker—through adolescents' selection of peers who drink. Siblings can also be an important source of support in times of stress, and act as therapists for siblings with some problems, such as eating disorders (Boer and Dunn 1990).

4.2 Influence on Social Understanding

The kinds of experience children have with their siblings are related to key aspects of their sociocognitive development. For instance, positive cooperative experiences with older siblings are correlated with the development of greater powers of understanding emotion and others' mental states, both in early and middle childhood. The direction of effects in these associations is not clear: children who are good at understanding feelings and others' minds are more effective cooperative play companions, so their early sophistication in social understanding may contribute to the development of cooperative play, which itself fosters further social understanding (Dunn 1992). Other aspects of prosocial, cooperative behavior, pretend play, and conflict management have all been reported to be associated with the experience of friendly sibling interactions. While it appears plausible that experiences with siblings should 'generalize' to children's relationships with peers, the story appears more complex, and there is not consistent evidence for simple positive links. The emotional dynamics and demands of sibling and friend relationships are very different.

5. Why Are Siblings So Different From One Another?

Striking differences between siblings in personality, adjustment, and psychopathology have been documented in a wide range of studies. These present a challenge to conventional views of family influence, as the children are growing up within the same family. Extensive studies by behavior geneticists have now shown that the sources of environmental influence that make individuals different from one another work within rather than between families. The contribution of sibling studies to our understanding of the relative roles of genetics and environment in studies of socialization and development has been notable: the message here is not that family influence is unimportant, but that we need to document those experiences that are specific to each child in the family, and we need to study more than one child per family if we are to clarify the salient environmental influences on development (Dunn and Plomin 1990, Hetherington et al. 1994). Developmental research documenting the extreme sensitivity with which children monitor interactions between other family members, together with the significance of differential parent–child relationships, helps to clarify the social processes implicated in the development of differences between children growing up in the 'same' family.
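The behavior-genetic point can be stated in the standard ACE variance decomposition (a conventional formalization, not notation used in this article):

\[
V_{P} = V_{A} + V_{C} + V_{E},
\]

where phenotypic variance $V_{P}$ is partitioned into additive genetic variance $V_{A}$, shared (between-family) environmental variance $V_{C}$, and nonshared (within-family) environmental variance $V_{E}$, the last also absorbing measurement error. The sibling findings described here locate most environmental influence on personality in $V_{E}$.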
6. Methodological Issues

For the study of siblings in childhood, a combination of naturalistic and structured observations and interviews with parents and siblings has proved most useful. Children are articulate and forthcoming about their relations with their siblings, and cope with interviews and questionnaires from an early age. As in the study of any relationship, it is important to obtain both participants' views on their relationship, as these may differ, and both are valid.
Cicirelli (1996) points out some particular methodological problems with studying siblings in adulthood: incomplete data sets formed when one sibling is unwilling or unable to participate, choices over which dyad is picked, and limitations in sample representativeness.
7. Future Directions Exciting future directions in sibling research include: clarification of the links between family relationships (including those involving step-relations); identification of the processes of family influence—for which we need studies with more than one child per family; further investigation of the role of genetics in individual development; and further exploration of significant sibling experiences for social understanding in middle childhood and adolescence. See also: Family Processes; Genetic Studies of Personality; Infancy and Childhood: Emotional Development; Sibling-order Effects; Social Relationships in Adulthood
Bibliography

Boer F, Dunn J 1990 Children's Sibling Relationships: Developmental and Clinical Issues. Lawrence Erlbaum Associates, Mahwah, NJ
Brody G H 1996 Sibling Relationships: Their Causes and Consequences. Ablex, Norwood, NJ
Cicirelli V G 1996 Sibling relationships in middle and old age. In: Brody G H (ed.) Sibling Relationships: Their Causes and Consequences. Ablex, Norwood, NJ, pp. 47–73
Dunn J 1992 Siblings and development. Current Directions in Psychological Science 1: 6–9
Dunn J, Deater-Deckard K, Pickering K, Beveridge M and the ALSPAC study team 1999 Siblings, parents and partners: Family relationships within a longitudinal community study. Journal of Child Psychology and Psychiatry 40: 1025–37
Dunn J, Plomin R 1990 Separate Lives: Why Siblings Are So Different. Basic Books, New York
Hetherington E M, Reiss D, Plomin R 1994 Separate Social Worlds of Siblings: The Impact of Nonshared Environment on Development. Lawrence Erlbaum Associates, Mahwah, NJ
Koch H L 1954 The relation of 'primary mental abilities' in five- and six-year-olds to sex of child and characteristics of his sibling. Child Development 25: 209–23
Patterson G R 1986 The contribution of siblings to training for fighting: A microsocial analysis. In: Olweus D, Block J, Radke-Yarrow M (eds.) Development of Antisocial and Prosocial Behavior: Research, Theories, and Issues. Academic Press, Orlando, FL, pp. 235–61
Riedmann A, White L 1996 Adult sibling relationships: Racial and ethnic comparisons. In: Brody G H (ed.) Sibling Relationships: Their Causes and Consequences: Advances in Applied Developmental Psychology. Ablex, Norwood, NJ, pp. 105–26
J. Dunn
Sign Language

Until recently, most of the scientific understanding of the human capacity for language has come from the study of spoken languages. It has been assumed that the organizational properties of language are connected inseparably with the sounds of speech, and that the fact that language is normally spoken and heard determines the basic principles of grammar, as well as the organization of the brain for language. There is good evidence that structures involved in breathing and chewing have evolved into a versatile and efficient system for the production of sounds in humans. Studies of brain organization indicate that the left cerebral hemisphere is specialized for processing linguistic information in the auditory–vocal mode and that the major language-mediating areas of the brain are connected intimately with the auditory–vocal channel. It has even been argued that hearing and the development of speech are necessary precursors to this cerebral specialization for language. Thus, the link between biology and linguistic behavior has been identified with the particular sensory modality in which language has developed. Although the path of human evolution has been in conjunction with the thousands of spoken languages the world over, recent research into signed languages has revealed the existence of primary linguistic systems that have developed naturally, independent of spoken languages, in a visual–manual mode. American Sign Language (ASL), for example, a sign language passed down from one generation to the next of deaf people, has all the complexity of spoken languages, and is as capable of expressing science, poetry, wit, historical change, and infinite nuances of meaning as are spoken languages. Importantly, ASL and other signed languages are not derived from the spoken language of the surrounding community: rather, they are autonomous languages with their own grammatical form and meaning. Although it was thought originally that signed languages were universal pantomime, or broken forms of spoken language on the hands, or loose collections of vague gestures, now scientists around the world have found that there are signed languages that spring up wherever there are communities and generations of deaf people (Klima and Bellugi 1988). One can now specify the ways in which the formal properties of languages are shaped by their modalities of expression, sifting properties peculiar to a particular language mode from more general properties common to all languages. ASL, for example, exhibits formal structuring at the same levels as spoken languages (the internal structure of lexical units and the grammatical scaffolding underlying sentences) as well as the same kinds of organizational principles as spoken languages. Yet the form this grammatical structuring assumes in a visual–manual language is apparently
deeply influenced by the modality in which the language is cast (Bellugi 1980). The existence of signed languages allows us to enquire about the determinants of language organization from a different perspective. What would language be like if its transmission were not based on the vocal tract and the ear? How is language organized when it is based instead on the hands moving in space and the eyes? Do these transmission channel differences result in any deeper differences? It is now clear that there are many different signed languages arising independently of one another and of spoken languages. At the core, spoken and signed languages are essentially the same in terms of organizational principles and rule systems. Nevertheless, on the surface, signed and spoken languages differ markedly. ASL and other signed languages display complex linguistic structure, but unlike spoken languages, convey much of their structure by manipulating spatial relations, making use of spatial contrasts at all linguistic levels (Bellugi et al. 1989).
1. The Structure of Sign Language

As already noted, the most striking surface difference between signed and spoken languages is the reliance on spatial contrasts, most evident in the grammar of the language. At the lexical level, signs can be separated from one another minimally by manual parameters (handshape, movement, location). The signs for summer, ugly, and dry are just the same in terms of handshape and movement, and differ only in the spatial location of the signs (forehead, nose, or chin). Instead of relying on linear order for grammatical morphology, as in English (act, acting, acted, acts), ASL grammatical processes nest sign stems in spatial patterns of considerable complexity (see Fig. 1), marking grammatical functions such as number, aspect, and person spatially. Grammatically complex forms can be nested spatially, one inside the other, with different orderings producing different hierarchically organized meanings. Similarly, the syntactic structure specifying relations of signs to one another in sentences of ASL is also essentially organized spatially. Nominal signs may be associated with abstract points in a plane of signing space, and it is the direction of movement of verb signs between such endpoints that marks grammatical relations. Whereas in English, the sentences 'the cat bites the dog' and 'the dog bites the cat' are differentiated by the order of the words, in ASL these differences are signaled by the movement of the verb between points associated with the signs for cat and dog in a plane of signing space. Pronominal signs directed toward such previously established points or loci clearly function to refer back to nominals, even with many signs intervening (see Fig. 2). This spatial organization underlying syntax is a unique property of visual-gestural systems (Bellugi et al. 1989).

2. The Acquisition of Sign Language by Deaf Children of Deaf Parents

Findings revealing the special grammatical structuring of a language in a visual mode lead to questions about the acquisition of sign language, its effect on nonlanguage visuospatial cognition, and its representation in the brain. Despite the dramatic surface differences between spoken and signed languages—simultaneity and the nesting of sign stems in complex co-occurring spatial patterns—the acquisition of sign language by deaf children of deaf parents shows a remarkable similarity to that of hearing children learning a spoken language. Grammatical processes in ASL are acquired at the same rate and in the same manner by deaf children as grammatical processes of English are by hearing children, as if there were a biological timetable underlying language acquisition.
Figure 1 Spatially organized (three-dimensional) morphology in ASL: a sign stem embedded in spatial patterns. (a) GIVE (uninflected); (b) GIVE[Durational] 'give continuously'; (c) GIVE[Exhaustive] 'give to each'; (d) GIVE[[Exhaustive] Durational] 'give to each, that action recurring over time'; (e) GIVE[[Durational] Exhaustive] 'give continuously to each in turn'; (f) GIVE[[Durational] Exhaustive] 'give continuously to each in turn, that action recurring over time'
Figure 2 Spatialized syntax
Figure 3 Deaf children's grammatical overregularizations: overregularized dual forms *BED[N:dual], *DUCK[N:dual], *FUN[N:dual]
First words and first signs appear at around 12 months; combining two words or two signs in children, whether deaf or hearing, occurs by about 18–20 months, and the rapid expansion signaling the development of grammar (morphology and syntax) develops in both spoken and signed languages by about 3–3½ years. Just as hearing children show their discovery of grammatical regularities by mastering them and producing overregularizations ('I goed there,' 'we helded the rabbit'), deaf children learning sign language show evidence of learning the spatial morphology signaling plural forms and aspectual forms by producing overregularizations in spatial patterns (see Fig. 3).
3. Neural Systems Subserving Visuospatial Languages

The differences between signed and spoken languages provide an especially powerful tool for understanding the neural systems subserving language. Consider the following: In hearing/speaking individuals, language
processing is mediated generally by the left cerebral hemisphere, whereas visuospatial processing is mediated by the right cerebral hemisphere. But what about a language that is communicated using spatial contrasts rather than temporal contrasts? On the one hand, the fact that sign language has the same kind of complex linguistic structure as spoken languages and the same expressivity might lead one to expect left hemisphere mediation. On the other hand, the spatial medium so central to the linguistic structure of sign language clearly suggests right hemisphere or bilateral mediation. In fact, the answer to this question is dependent on the answer to another, deeper, question concerning the basis of the left hemisphere specialization for language. Specifically, is the left hemisphere specialized for language processing per se (i.e., is there a brain basis for language as an independent entity)? Or is the left hemisphere’s dominance generalized to process any type of information that is presented in terms of temporal contrasts? If the left hemisphere is indeed specialized for processing language itself, sign language processing should be mediated by the left
hemisphere, as is spoken language. If, however, the left hemisphere is specialized for processing fast temporal contrasts in general, one would expect sign language processing to be mediated by the right hemisphere. The study of sign languages in deaf signers permits us to pit the nature of the signal (auditory-temporal vs. visual-spatial) against the type of information (linguistic vs. nonlinguistic) that is encoded in that signal as a means of examining the neurobiological basis of language (Poizner et al. 1990). One program of studies examines deaf lifelong signers with focal lesions to the left or the right cerebral hemisphere. Major areas, each focusing on a special property of the visual-gestural modality as it bears on the investigation of brain organization for language, are investigated. There are intensive studies of large groups of deaf signers with left or right hemisphere focal lesions in one program (Salk); all are highly skilled ASL signers, and all used sign as a primary form of communication throughout their lives. Individuals were examined with an extensive battery of experimental probes, including formal testing of ASL at all structural levels; spatial cognitive probes sensitive to right-hemisphere damage in hearing people; and new methods of brain imaging, including structural and functional magnetic resonance imaging (MRI, fMRI), event-related potentials (ERP), and positron emission tomography (PET). This large pool of well-studied and thoroughly characterized subjects, together with new methods of brain imaging and sensitive tests of signed as well as spoken language, allows for a new perspective on the determinants of brain organization for language (Hickok and Bellugi 2000, Hickok et al. 1996).
3.1 Left Hemisphere Lesions and Sign Language Grammar

The first major finding is that, so far, only deaf signers with damage to the left hemisphere show sign language aphasias. Marked impairment in sign language after left hemisphere lesions was found in the majority of the left hemisphere damaged (LHD) signers, but not in any of the right hemisphere damaged (RHD) signers, whose language profiles were much like matched controls. Figure 4 presents a comparison of LHD, RHD, and normal control profiles of sign characteristics from The Salk Sign Diagnostic Aphasia Examination—a measure of sign aphasia. The RHD signers showed no impairment at all in any aspect of ASL grammar; their signing was rich, complex, and without deficit, even in the spatial organization underlying sentences of ASL. By contrast, signers with LHD showed markedly contrasting profiles: one was agrammatic after her stroke, producing only nouns and a few verbs with none of the grammatical apparatus of ASL; another made frequent paraphasias at the sign-internal level; and several showed many
grammatical paraphasias, including neologisms, particularly in morphology. Another deaf signer showed deficits in the spatially encoded grammatical operations which link signs in sentences, a remarkable failure in the spatially organized syntax of the language. Still another deaf signer with focal lesions to the left hemisphere reveals dissociations not found in spoken language: a dissociation between sign and gesture, with a cleavage between capacities for sign language (severely impaired) and manual pantomime (spared). In contrast, none of the RHD signers showed any within-sentence deficits; they were completely unimpaired in sign sentences and not one showed aphasia for sign language (in contrast to their marked nonlanguage spatial deficits, described below) (Hickok and Bellugi 2000, Hickok et al. 1998). Moreover, there are dramatic differences in performance between left- and right-hemisphere damaged signers in formal experimental probes of sign competence. For example, a test of the equivalent of rhyming in ASL provides a probe of phonological processing. Two signs 'rhyme' if they are similar in all but one phonological parametric value such as handshape, location, or movement. To tap this aspect of phonological processing, subjects are presented with an array of pictured objects and asked to pick out the two objects whose signs rhyme (Fig. 5). The ASL signs for key and apple share the same handshape and movement, and differ only in location, and thus are like rhymed pairs. LHD signers are significantly impaired relative to RHD signers and controls on this test, another sign of the marked difference in effects of right- and left-hemisphere lesions on signing. On other tests of ASL processing at different structural levels, there are similar distinctions between left- and right-lesioned signers, with the right-lesioned signers much like the controls, but the signers with left hemisphere lesions significantly impaired in language processing. Moreover, studies have found that there can be differential breakdown of linguistic components of sign language (lexicon and grammar) with different left hemisphere lesions.
3.2 Right Hemisphere Lesions and Nonlanguage Spatial Processing

These results from language testing contrast sharply with results on tests of nonlanguage spatial cognition. RHD signers are significantly more impaired on a wide range of spatial cognitive tasks than LHD signers, who show little impairment. Drawings of many of the RHD signers (but not those with LHD) show severe spatial distortions, neglect of the left side of space, and lack of perspective. RHD deaf signers show lack of perspective, left neglect, and spatial disorganization on an array of spatial cognitive nonlanguage tests (block design, drawing, hierarchical processing), compared with LHD deaf signers. Yet, astonishingly, these severe spatial deficits among RHD signers do not affect their competence in a spatially nested language, ASL. The case of a signer with a right parietal lesion leading to severe left neglect is of special interest: whereas his drawings show characteristic omissions on the left side of space, his signing is impeccable, with signs and spatially organized syntax perfectly maintained. The finding that sign aphasia follows left hemisphere lesions but not right hemisphere lesions provides a strong case for a modality-independent linguistic basis for the left hemisphere specialization for language. These data suggest that the left hemisphere is predisposed biologically for language, independent of language modality. Thus, hearing and speech are not necessary for the development of hemisphere specialization—sound is not crucial. Furthermore, the finding of a dissociation between competence in a spatial language and competence in nonlinguistic spatial cognition demonstrates that it is the type of information that is encoded in a signal (i.e., linguistic vs. spatial information) rather than the nature of the signal itself (i.e., spatial vs. temporal) that determines the organization of the brain for higher cognitive functions.

Figure 4 Left hemisphere lesions lead to sign language aphasias: rating scale profiles of sign characteristics (melodic line, phrase length, articulatory agility, grammatical form, paraphasia in running sign, sign finding, sign comprehension; 7-point scales) for LHD signers, RHD signers, and control deaf signers

Figure 5 Rhyming task with LHD and RHD signers: sample item on the ASL rhyming test and percent correct for LHD vs. RHD signers
4. Language, Modality, and the Brain Analysis of patterns of breakdown in deaf signers provides new perspectives on the determinants of hemispheric specialization for language. First, the data show that hearing and speech are not necessary for the development of hemispheric specialization: sound is not crucial. Second, it is the left hemisphere that is dominant for sign language. Deaf signers with damage to the left hemisphere show marked sign language deficits but relatively intact capacity for processing nonlanguage visuospatial relations. Signers with damage to the right hemisphere show the reverse pattern. Thus, not only is there left hemisphere specialization for language functioning, there is also complementary specialization for nonlanguage spatial functioning. The fact that grammatical information in sign language is conveyed via spatial manipulation does not alter this complementary specialization. Furthermore, components of sign language (lexicon and grammar) can be selectively impaired, reflecting
differential breakdown of sign language along linguistically relevant lines. These data suggest that the left hemisphere in humans may have an innate predisposition for language, regardless of the modality. Since sign language involves an interplay between visuospatial and linguistic relations, studies of sign language breakdown in deaf signers may, in the long run, bring us closer to the fundamental principles underlying hemispheric specialization.
See also: Evolution and Language: Overview; Language and Animal Competencies; Sign Language: Psychological and Neural Aspects

Bibliography

Bellugi U 1980 The structuring of language: Clues from the similarities between signed and spoken language. In: Bellugi U, Studdert-Kennedy M (eds.) Signed and Spoken Language: Biological Constraints on Linguistic Form. Dahlem Konferenzen. Weinheim/Deerfield Beach, FL, pp. 115–40
Bellugi U, Poizner H, Klima E S 1989 Language, modality and the brain. Trends in Neurosciences 10: 380–8
Emmorey K, Kosslyn S M, Bellugi U 1993 Visual imagery and visual-spatial language: Enhanced imagery abilities in deaf and hearing ASL signers. Cognition 46: 139–81
Hickok G, Bellugi U 2000 The signs of aphasia. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, 2nd edn. Elsevier Science Publishers, Amsterdam, The Netherlands
Hickok G, Bellugi U, Klima E S 1996 The neurobiology of signed language and its implications for the neural organization of language. Nature 381: 699–702
Hickok G, Bellugi U, Klima E S 1998 The basis of the neural organization for language: Evidence from sign language aphasia. Reviews in the Neurosciences 8: 205–22
Klima E S, Bellugi U 1988 The Signs of Language. Harvard University Press, Cambridge, MA
Poizner H, Klima E S, Bellugi U 1990 What the Hands Reveal About the Brain. MIT Press/Bradford Books, Cambridge, MA

U. Bellugi

Sign Language: Psychological and Neural Aspects

Signed languages of the deaf are naturally occurring human languages. The existence of languages expressed in different modalities (i.e., oral–aural, manual–visual) provides a unique opportunity to explore and distinguish those properties shared by all human languages from those that arise in response to the modality in which the language is expressed. Despite the differences in language form, signed languages possess formal linguistic properties found in spoken languages. Sign language acquisition follows a developmental trajectory similar to that of spoken languages. Memory for signs exhibits patterns of interference and forgetting that are similar to those found for speech. Early use of sign language may enhance certain aspects of nonlanguage visual perception. Neuropsychological studies show that left hemisphere regions are important in both spoken and sign language processing.

1. Linguistic Principles of American Sign Language

1.1 Language and Deaf Culture

Sign languages are naturally occurring manual languages that arise in communities of deaf individuals. These manual communication systems are fully expressive, systematic human languages and are not merely conventionalized systems of pantomime nor manual codifications of a spoken language. Many types of deafness are inheritable, and it is not unusual to find isolated communities of deaf individuals who have developed complex manual languages (Groce 1985). The term Deaf Community has been used to describe a sociolinguistic entity that plays a crucial role in a deaf person's exposure to and acceptance of sign language (Padden and Humphries 1988). American Sign Language (ASL), used by members of the Deaf community in the USA and Canada, is only one of many sign languages of the world, but it is the one that has been studied most extensively.

1.2 Structure of American Sign Language

A sign consists of a hand configuration that travels along a movement path and is directed to or about a specific body location. Sign languages differ from one another in the inventories and compositions of handshapes, movements, and locations used to signal linguistic contrasts, just as spoken languages differ from one another in the selection of sounds used, and how those sounds are organized into words. Many sign languages incorporate a subsystem in which orthographic symbols used in the surrounding spoken language communities are represented manually on the hands. One example is the American English manual alphabet, which is produced on one hand and allows users of ASL to represent English lexical items. All human languages, whether spoken or signed, exhibit levels of structure which govern the composition and formation of word forms and specify how words combine into sentences. In formal linguistic analyses, these structural levels are referred to as the phonology, morphology, and syntax of the language. In this context, phonological organization refers to the patterning of the abstract formational
units of a natural language (Coulter and Anderson 1993). Compared to spoken languages, in which contrastive units (for example, phonemes) are largely arranged in a linear fashion, signed languages exhibit simultaneous layering of information. For example, in a sign, a handshape will co-occur with, rather than follow sequentially, a distinct movement pattern. ASL exhibits complex morphology. Morphological markings in ASL are expressed as dynamic movement patterns overlaid on a more basic sign form. These nested morphological forms stand in contrast to the linear suffixation common in spoken languages (Klima and Bellugi 1979). The prevalence of simultaneous layering of phonological content and the nested morphological devices observed across many different sign languages likely reflect an influence of modality on the realization of linguistic structure. Thus the shape of human languages reflects the constraints and affordances imposed by the articulator systems involved in transmission of the signal (i.e., oral versus manual) and the receptive mechanisms for decoding the signal (i.e., auditory versus visual). A unique property of ASL linguistic structure is the reliance upon visuospatial mechanisms to signal linguistic contrasts and relations. One example concerns the use of facial expressions in ASL. In ASL, certain syntactic and adverbial constructions are marked by specific and obligatory facial expressions (Liddell 1980). These linguistic facial expressions differ significantly in appearance and execution from affective facial expressions. A second example concerns the use of inflectional morphology to express subject and object relationships. At the syntactic level, nominals introduced into the discourse are assigned arbitrary reference points along a horizontal plane in the signing space. Signs with pronominal function are directed toward these points, and the class of verbs which require subject/object agreement obligatorily move between these points (Lillo-Martin and Klima 1990). Thus, whereas many spoken languages represent grammatical functions through case marking or linear ordering, in ASL grammatical function is expressed through spatial mechanisms. This same system of spatial reference, when used across sentences, serves as a means of discourse cohesion (Winston 1995).
1.3 Sign Language Acquisition

Children exposed to signed languages from birth acquire these languages on a similar maturational timetable as children exposed to spoken languages (Meier 1991). Prelinguistic infants, whether normally hearing or deaf, engage in vocal play commonly known as babbling. Recent research has shown that prelinguistic gestural play referred to as manual babbling will accompany vocal babbling. There appear to be significant continuities between prelinguistic gesture and early signs in deaf children exposed to American Sign Language, though questions remain as to whether manual babbling is constrained by predominantly motoric or linguistic factors (Cheek et al. 2001, Petitto and Marantette 1991). Between 10 and 12 months of age, children reared in signing homes begin to produce their first signs, with two-sign combinations appearing at approximately 18 months. Some research has suggested a precocious early vocabulary development in signing children. However, these reports are likely to be a reflection of parents' and experimenters' earlier recognition of signs compared to words, rather than underlying differences in development of symbolic capacities of signing and speaking children. Signing infants produce the same range of grammatical errors in signing that have been documented for spoken language, including phonological substitutions, morphological overregularizations, and anaphoric referencing confusions (Petitto 2000). For example, in the phonological domain a deaf child will often use only a subset of the handshapes found in the adult inventory, opting for a simpler set of 'unmarked' handshapes. This is similar to the restricted range of consonant usage common in children acquiring a spoken language.

2. Psycholinguistic Aspects of Sign Language

Psycholinguistic studies of American Sign Language have examined how the different signaling characteristics of the language impact transmission and recognition. Signs take longer to produce than comparable word forms. The average number of words per second in running speech is about 4 to 5, compared with 2 to 3 signs per second in fluent signing. However, despite differences in word transmission rate, the proposition rate for speech and sign is the same, roughly one proposition every 1 to 2 seconds. Compare, for example, the English phrase 'I have already been to California' with the ASL equivalent, which can be succinctly signed using three monosyllabic signs, glossed as FINISH TOUCH CALIFORNIA.

2.1 Sign Recognition

Studies of sign recognition have examined how signs are recognized in time. Recognition appears to follow a systematic pattern in which information about the location of the sign is reliably identified first, followed by handshape information and finally the movement. Identification of the movement reliably leads to the identification of the sign. Interestingly, despite the slower articulation of a sign compared to a word, sign recognition appears to be faster than word recognition. Specifically, it has been observed that proportionally less of the signal needs to be processed in order to uniquely identify a sign compared to a spoken
word. For example, one study reports that only 240 msec, or 35 percent of a sign, has to be seen before a sign is identified (Emmorey and Corina 1990). In comparable studies of spoken English, Grosjean (1980) reports that 330 msec, or 83 percent of a word, has to be heard before a word can be identified. This finding is due, in part, to the simultaneous patterning of phonological information in signs compared to the more linear patterning of phonological information characteristic of spoken languages. These structural differences in turn have implications for the organization of lexical neighborhoods. Spoken languages, such as English, may have many words that share in their initial phonological structures (tram, tramp, trampoline, etc.), leading to greater coactivation (and thus longer processing time) during word recognition. In contrast, ASL sign forms are often formationally quite distinct from one another, permitting quicker selection of a unique lexical entry.
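These gating figures also imply total item durations. The following back-of-envelope check is a sketch of my own, using only the percentages quoted above:

    # Implied total item durations from the gating results quoted above:
    # identification point (msec) divided by the proportion of the item seen/heard.
    sign_ms, sign_prop = 240, 0.35   # Emmorey and Corina (1990)
    word_ms, word_prop = 330, 0.83   # Grosjean (1980)

    print(round(sign_ms / sign_prop))  # ~686 msec per sign
    print(round(word_ms / word_prop))  # ~398 msec per word

Consistent with the text, the isolated signs are longer overall, yet are identified both absolutely and proportionally earlier than the spoken words.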
2.2 Memory for Signs

Efforts to explore how the demands of sign language processing influence memory and attention have led to several significant findings. Early studies of memory for lists of signs report classic patterns of forgetting and interference, including serial position effects of primacy and recency (i.e., signs at the beginning and the end of a list are better remembered than items in the middle). Likewise, when sign lists are composed of phonologically similar sign forms, signers exhibit poorer recall (Klima and Bellugi 1979). This result is similar to what has been reported for users of spoken language, where subjects exhibit poorer memory for lists of similarly sounding words (Conrad and Hull 1964). These findings indicate that signs, like words, are encoded into short-term memory in a phonological or articulatory code rather than in a semantic code. More recent work has assessed whether Baddeley's (1986) working memory model pertains to sign language processing. This model includes components that encode linguistic and visuospatial information as well as a central executive which mediates between immediate and long-term memory. Given the spatial nature of the sign signal, the question of which working memory component(s) is engaged during sign language processing is particularly interesting. Wilson and Emmorey (2000) have shown word-length effects in signs; it is easier to maintain a cohort of short signs (monosyllabic signs) than long signs (polysyllabic signs) in working memory. They also report effects of 'articulatory suppression.' In these studies, requiring a signer to produce repetitive, sign-like hand movements while encoding and maintaining signs in memory has a detrimental effect on aspects of recall. Once again, analogous effects are known to exist for spoken languages, and these phenomena provide support for a multicomponent model of linguistic working memory
that includes both an articulatory loop and a phonological store. Finally, under some circumstances deaf signers are able to utilize linguistically relevant spatial information to encode signs, suggesting the engagement of a visuospatial component of working memory.

2.3 Perception and Attention

Recent experiments with native users of signed languages have shown that experience with a visual sign language may improve or alter visual perception of nonlanguage stimuli. For example, compared to hearing nonsigners, deaf signers have been shown to possess enhanced or altered perception along several visual dimensions such as motion processing (Bosworth and Dobkins 1999), mental rotation, and processing of facial features (Emmorey 2001). In particular, the ability to detect and attend selectively to peripheral, but not central, targets in vision is enhanced in signers (Neville 1991). Consistent with this finding is evidence for increased functional connectivity in neural areas mediating peripheral visual motion in the deaf (Bavelier et al. 2000). As motion processing, mental rotation, and facial processing underlie aspects of sign language comprehension, these perceptual changes have been attributed to experience with sign language. Moreover, several of these studies have reported such enhancements in hearing signers raised in deaf signing households. These findings provide further evidence that these visual perception enhancements are related to the acquisition of a visual manual language, and are not due to compensatory mechanisms developed as a result of auditory deprivation.
3. Neural Organization of Signed Language

3.1 Hemispheric Specialization

One of the most significant findings in neuropsychology is that the two cerebral hemispheres show complementary functional specialization, whereby the left hemisphere mediates language behaviors while the right hemisphere mediates visuospatial abilities. As noted, signed languages make significant use of visuospatial mechanisms to convey linguistic information. Thus sign languages exhibit properties for which each of the cerebral hemispheres shows specialization. Neuropsychological studies of brain-injured deaf signers and functional imaging studies of brain-intact signers have provided insights into the neural systems underlying sign language processing.

3.2 Aphasia in Signed Language

Aphasia refers to acquired impairments in the use of language following damage to the perisylvian region
of the left hemisphere. Neuropsychological case studies of deaf signers convincingly demonstrate that aphasia in signed language is also found after left hemisphere perisylvian damage (Poizner et al. 1987). Moreover, within the left hemisphere, production and comprehension impairments follow the well-established anterior versus posterior dichotomy. Studies have documented that frontal anterior damage leads to Broca-like sign aphasia. In these cases, normal fluent signing is reduced to effortful, single-sign, 'telegraphic' output with little morphological complexity (such as verb agreement). Comprehension, however, is left largely intact. Wernicke-like sign aphasia following damage to the posterior third of the perisylvian region presents with fluent but often semantically opaque output, and comprehension also suffers. In cases of left hemisphere damage, the occurrence of sign language paraphasias may be observed. Paraphasia may be formationally disordered (for example, substituting an incorrect handshape) or semantically disordered (producing a word that is semantically related to the intended form). These investigations serve to underscore the importance of left hemisphere structures for the mediation of signed languages and illustrate that sign language breakdown is not haphazard, but rather honors linguistic boundaries (Corina 2000). The effects of cortical damage to primary language areas that result in aphasic disturbance can be differentiated from more general impairments of movement. For example, Parkinson's disease leads to errors involving timing, scope, and precision of general movements including, but not limited to, those involved in speech and signing. These errors produce phonetic disruptions in signing, rather than the higher-level phonemic disruptions that are apparent in aphasic signing (Corina 1999). Apraxia is defined as an impairment of the execution of a learned movement (e.g., saluting, knowing how to work a knife and fork). Lesions associated with the left inferior parietal lobe result in an inability to perform and comprehend gestures (Gonzalez Rothi and Heilman 1997). Convincing dissociations of sign language impairment with well-preserved praxic abilities have been reported. In one case, a subject with marked sign language aphasia affecting both production and comprehension produced unencumbered pantomime. Moreover, both comprehension and production of pantomime were found to be better preserved than was sign language. These data indicate that language impairments following left hemisphere damage are not attributable to undifferentiated symbolic impairments and demonstrate that ASL is not simply an elaborate pantomimic system.
3.3 Role of the Right Hemisphere in ASL

Studies of signers with right hemisphere damage (RHD) have reported significant visuospatial disruption in nonlinguistic domains (e.g., face processing, drawing, block construction, route finding) but have reported only minimal language disruption. Problems in the organization of discourse have been observed in RHD signers, as have also been reported in users of spoken languages. More controversial are the influences of visuospatial impairment on the use of highly spatialized components of the language such as complex verb agreement and the classifier system. Further work is needed to understand whether these infrequently reported impairments (both in comprehension and production) reflect core linguistic deficits or rather reflect secondary effects of impaired visuospatial processing.

3.4 Functional Imaging

Recent studies using functional brain imaging have explored the neural organization of language in users of signed languages. These studies have consistently found participation of classic left hemisphere perisylvian language areas in the mediation of sign language in profoundly deaf, lifelong signers. For example, Positron Emission Tomography studies of production in British Sign Language (McGuire et al. 1997) and Langue des Signes Quebecoise (Petitto et al. 2000) reported that deaf subjects activated left inferior frontal regions, regions similar to those that mediate speech in hearing subjects. Researchers using functional Magnetic Resonance Imaging techniques to investigate comprehension of ASL in deaf and hearing native users of signed language have shown significant activations in frontal opercular areas (including Broca's area and dorsolateral prefrontal cortex) as well as in posterior temporal areas (such as Wernicke's area and the angular gyrus) (Neville et al. 1998). Also reported was extensive activation in right hemisphere regions. Subsequent studies have confirmed that aspects of the right posterior hemisphere activation appear to be unique to sign language processing and present only in signers who acquired sign language from birth (Newman et al. in press). Investigations of psychological and neural aspects of signing reveal strong commonalities in the development and cognitive processing of signed and spoken languages despite major differences in the surface form of these languages. Early exposure to a sign language may lead to enhancements in the specific neural systems underlying visual processing. There appears to be a strong biological predisposition for left hemisphere structures in the mediation of language, regardless of the modality of expression.
See also: Language Acquisition; Language Development, Neural Basis of; Sign Language
Bibliography

Baddeley A 1986 Working Memory. Oxford University Press, New York
Bavelier D, Tomann A, Hutton C, Mitchell T, Corina D, Liu G, Neville H 2000 Visual attention at the periphery is enhanced in congenitally deaf individuals. Journal of Neuroscience 20: RC93
Bosworth R, Dobkins K 1999 Left-hemisphere dominance for motion processing in deaf signers. Psychological Science 10: 256–62
Cheek A, Cormier K, Repp A, Meier R 2001 Prelinguistic gesture predicts mastery and error in the production of early signs. Language
Corina D 1999 Neural disorders of language and movement: Evidence from American Sign Language. In: Messing L, Campbell R (eds.) Gesture, Speech and Sign. Oxford University Press, New York
Corina D 2000 Some observations regarding paraphasia in American Sign Language. In: Emmorey K, Lane H (eds.) The Signs of Language Revisited: An Anthology to Honor Ursula Bellugi and Edward Klima. Lawrence Erlbaum Associates, Mahwah, NJ
Coulter G, Anderson S 1993 Introduction. In: Coulter G (ed.) Phonetics and Phonology: Current Issues in ASL Phonology. Academic Press, San Diego, CA
Conrad R, Hull A 1964 Information, acoustic confusion and memory span. British Journal of Psychology 55: 429–32
Emmorey K 2001 Language, Cognition, and the Brain: Insights From Sign Language Research. Lawrence Erlbaum Associates, Mahwah, NJ
Emmorey K, Corina D 1990 Lexical recognition in sign language: Effects of phonetic structure and morphology. Perceptual and Motor Skills 71: 1227–52
Gonzalez Rothi L, Heilman K 1997 Introduction to limb apraxia. In: Gonzalez Rothi L, Heilman K (eds.) Apraxia: The Neuropsychology of Action. Psychology, Hove, UK
Grosjean F 1980 Spoken word recognition processes and the gating paradigm. Perception and Psychophysics 28: 267–83
Groce J 1985 Everyone Here Spoke Sign Language. Harvard University Press, Cambridge, MA
Klima E, Bellugi U 1979 The Signs of Language. Harvard University Press, Cambridge, MA
Liddell S 1980 American Sign Language Syntax. Mouton, The Hague, The Netherlands
Lillo-Martin D, Klima E 1990 Pointing out differences: ASL pronouns in syntactic theory. In: Fisher S, Siple P (eds.) Theoretical Issues in Sign Language Research I: Linguistics. University of Chicago Press, Chicago
McGuire P, Robertson D, Thacker A, David A, Frackowiak R, Frith C 1997 Neural correlates of thinking in sign language. Neuroreport 8: 695–8
Meier R 1991 Language acquisition by deaf children. American Scientist 79: 60–70
Neville H 1991 Neurobiology of cognitive and language processing: Effects of early experience. In: Gibson K, Peterson A (eds.) Brain Maturation and Cognitive Development: Comparative and Cross-cultural Perspectives. Aldine de Gruyter Press, Hawthorne, NY
Neville H, Bavelier D, Corina D, Rauschecker J, Karni A, Lalwani A, Braun A, Clark V, Jezzard P, Turner R 1998 Cerebral organization for language in deaf and hearing subjects: Biological constraints and effects of experience. Proceedings of the National Academy of Sciences 90: 922–9
Newman A, Corina D, Tomann A, Bavelier D, Jezzard P, Braun A, Clark V, Mitchell T, Neville H (submitted) Effects of age of acquisition on cortical organization for American Sign Language: an fMRI study. Nature Neuroscience
Padden C, Humphries T 1988 Deaf in America: Voices from a Culture. Harvard University Press, Cambridge, MA
Petitto L 2000 The acquisition of natural signed languages: Lessons in the nature of human language and its biological foundations. In: Chamberlain C, Morford J (eds.) Language Acquisition by Eye. Lawrence Erlbaum Associates, Mahwah, NJ
Petitto L, Marantette P 1991 Babbling in the manual mode: Evidence for the ontogeny of language. Science 251: 1493–6
Petitto L, Zatorre R, Gauna K, Nikelski E, Dostie D, Evans A 2000 Speech-like cerebral activity in profoundly deaf people processing signed languages: Implications for the neural basis of human language. Proceedings of the National Academy of Sciences 97: 13961–6
Poizner H, Klima E, Bellugi U 1987 What the Hands Reveal About the Brain. MIT Press, Cambridge, MA
Wilson M, Emmorey K 2000 When does modality matter? Evidence from ASL on the nature of working memory. In: Emmorey K, Lane H (eds.) The Signs of Language Revisited: An Anthology to Honor Ursula Bellugi and Edward Klima. Lawrence Erlbaum Associates, Mahwah, NJ
Winston E 1995 Spatial mapping in comparative discourse frames. In: Emmorey K, Reilly J (eds.) Language, Gesture, and Space. Lawrence Erlbaum Associates, Mahwah, NJ
D. P. Corina
Signal Detection Theory

Signal detection theory (SDT) is a framework for interpreting data from experiments in which accuracy is measured. It was developed in a military context (see Signal Detection Theory, History of), then applied to sensory studies of auditory and visual detection, and is now widely used in cognitive science, diagnostic medicine, and many other fields. The key tenets of SDT are that the internal representations of stimulus events include variability, and that perception (in an experiment, or in everyday life) incorporates a decision process. The theory characterizes both the representation and the decision rule.
1. Detection and Discrimination Experiments

Consider an auditory experiment to determine the ability of a listener to detect a weak sound. On some trials, the sound is presented, and the listener is instructed to say 'yes' (I heard it) or 'no' (I did not hear it). On other trials, there is no sound, discouraging the observer from responding 'yes' indiscriminately. The presence of variability in sensory systems leads us to call these latter presentations noise (N) trials, and the former signal plus noise (S+N) trials. Possible data based on 50 trials of each type are given in Table 1.
Table 1 Response frequencies in an auditory detection experiment

                   'Yes'   'No'   Total
Signal + noise      42       8     50
Noise               15      35     50
Responses of 'yes' on signal trials are called hits, and on no-signal trials are called false alarms. The observer's performance can be summarized by a hit rate (here 0.84) and a false-alarm rate (here 0.30). The variability principle implies that repeated presentation of the signal leads to a distribution of internal effect, as does repeated presentation of no signal. In the most common SDT model, these distributions are assumed to be normal, and to differ only in mean, as shown in Fig. 1. How does the observer reach a decision in this experiment, when any observation on the 'internal effect' axis is ambiguous with regard to source? The best decision rule is to establish a criterion value c on the axis, and respond 'yes' for values to the right of c, 'no' for values to the left. The location of the criterion can be determined from the data, for the area above it under the S+N distribution must equal the hit rate. Consulting a table of the normal distribution reveals that in the example above the criterion must therefore be about one standard deviation below the mean of that distribution. Similarly, the area above the criterion under the N distribution must equal the false-alarm rate, so the criterion is about 0.5 standard deviations above the mean of that distribution.

Figure 1 Distributions assumed by SDT to result from N and S+N. The horizontal axis is scaled in standard deviation units, and the difference between the means is d′. The observer responds 'yes' for values to the right of the criterion (vertical line), 'no' for those to the left

2. Sensitivity and Response Bias

Figure 1 offers natural interpretations of the sensitivity of the observer in the task and of the decision rule. Sensitivity, usually denoted d′, is reflected by the difference between the two distribution means; this characteristic of the representation is unaffected by the location of the criterion, and is thus free of response bias. The magnitude of d′ is found easily by computing the distances between each mean and the criterion, found in the last section:

d′ = z(hit rate) − z(false-alarm rate)   (1)

where z(p) is the point on a unit-normal distribution below which the area p can be found. The function z is negative for arguments less than 0.5 and positive for arguments above that value. In the example,

d′ = z(0.84) − z(0.30) = 0.994 − (−0.524) = 1.518
Criterion location is one natural measure of response bias. If evaluated relative to the point at which the two distributions cross, it can be calculated by a formula analogous to Eqn. 1:

c = −½[z(hit rate) + z(false-alarm rate)]   (2)

In the example above, c = −½[z(0.84) + z(0.30)] = −0.235; that is, the criterion is slightly to the left of the crossover point, as in fact is shown in Fig. 1.
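Both indices are easy to compute. The following is a minimal sketch of my own (not from the article), using scipy's inverse normal CDF for z(p) and the Table 1 data:

    # d' (Eqn. 1) and c (Eqn. 2) from hit and false-alarm rates.
    from scipy.stats import norm

    def dprime_and_criterion(hit_rate, fa_rate):
        zh, zf = norm.ppf(hit_rate), norm.ppf(fa_rate)   # z(p) via inverse normal CDF
        return zh - zf, -0.5 * (zh + zf)

    d, c = dprime_and_criterion(42 / 50, 15 / 50)   # Table 1: 0.84 and 0.30
    print(round(d, 2), round(c, 3))                 # 1.52 -0.235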
3. Receiver Operating Characteristic (ROC) Curves

If sensitivity truly is unaffected by the location of the response criterion, then a shift in that criterion should leave d′ unchanged. An observer's decision rule can be manipulated between experimental runs through instructions or payoffs. In the more efficient rating method, a rating response is required; the analysis assumes that each distinct rating corresponds to a different region along the axis of Fig. 1, and that each region is defined by a different criterion. The data are analyzed by calculating a hit and false-alarm rate for each criterion separately. A plot of hit rate vs. false-alarm rate is called a receiver-operating characteristic, or ROC; the ROC that would be produced from the
Figure 2 An ROC curve, the relation between hit and false-alarm rates, both of which increase as the criterion location moves (from right to left, in Fig. 1)
representation in Fig. 1 is shown in Fig. 2. Points at the lower end of the curve correspond to high criteria, and the curve is traced out from left to right as the criterion moves from right to left in Fig. 1. The shape of the ROC is determined by the form of the distributions in the representation, and data largely support the normality assumption. However, the assumption that the S+N and N distributions have the same variance is often incorrect (Swets 1986). In such cases, the area under the ROC is a good measure of accuracy (Swets and Pickett 1982); this index ranges from 0.5 for chance performance to 1.0 for perfect discrimination. An entire ROC (from a rating experiment, for example) is needed to estimate this area.
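To make the model concrete, here is a brief sketch of my own (the d′ value and criterion grid are illustrative) that traces the equal-variance normal-model ROC of Fig. 2 and computes the area under it:

    # Sweep the criterion under the equal-variance normal model (d' = 1.5);
    # each criterion yields one (false-alarm rate, hit rate) point on the ROC.
    import numpy as np
    from scipy.stats import norm

    d = 1.5
    c = np.linspace(-4, 5, 500)   # criterion locations
    fa = norm.sf(c)               # false-alarm rate: area above c under N
    hit = norm.sf(c - d)          # hit rate: area above c under S+N

    # Area under the ROC (integrate hit over increasing false-alarm rate).
    print(round(np.trapz(hit[::-1], fa[::-1]), 3))   # ~0.856 for d' = 1.5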
4. Alternative Assumptions About Representation

The essence of signal detection theory is the distinction between the representation and the decision rule, not the shape of the underlying distributions, and two types of nonnormal distributions have been important. One such distribution is the logistic, which has a shape very similar to the normal but is computationally more tractable. Logistic models have been used for complex experimental designs, and for statistical analysis. Another family of distributions, whose shape is rectangular, leads to rectilinear ROCs and can be interpreted as support for 'thresholds' that divide discrete perceptual states. Such curves are unusual, experimentally, but have been found in studies of recognition memory (Yonelinas and Jacoby 1994). Models using threshold assumptions have been developed for complex experimental situations, such as source monitoring in memory; in this context, they are called multinomial models (see Discrete State Models of Information Processing).

5. Identification and Classification Along a Single Dimension
Detection theory is extended easily to experiments with multiple stimuli differing on a single dimension. For example, a set of tones differing only in intensity may be presented one at a time for identification (the observer must identify the exact stimulus presented) or classification (the observer must sort the set into two or more categories). The representation is assumed to be a set of distributions along a decision axis, with m−1 criteria used to assign responses to m categories. SDT analysis allows the calculation of d′ (or a related measure, if equal variances are not assumed) for any stimulus pair, and an interesting question concerns the relation between this value and that found in a discrimination task using just those two stimuli. According to a theory of Braida and Durlach (1988), identification d′ will be lower than discrimination d′ by an amount that reflects memory constraints in the former task. The predicted relation is qualitatively correct for many stimulus dimensions, and can be quantitatively predicted for tones differing in intensity.
6. Complex Discrimination Designs and Multidimensional Detection Theory

The one-interval design for measuring discrimination is appropriate for some applications (it seems quite natural in recognition memory, for example, where the S+N trials are studied items and the N trials are foils), but other methods are often preferred. One popular technique is two-alternative forced-choice (2AFC), in which both stimuli are presented on every trial and the observer must say whether N or S+N occurred first. Another method is same-different, in which there are again two intervals, but each interval can contain either of the two stimuli and the observer must say whether the two stimuli are the same or different. These and many other designs can be analyzed by assuming that each interval is represented by a value on a separate, independent perceptual dimension. Although d′ is a characteristic of the stimuli and the observer, and therefore constant across paradigms, performance as measured by percent correct [p(c)] varies widely (Macmillan and Creelman 1991). For example, Sect. 2 showed that if d′ = 1.5, p(c) = 0.77 in the one-interval task; but in 2AFC the same observer would be expected to score p(c) = 0.86, and in the same-different task as low as p(c) = 0.61. The relation between one-interval and 2AFC was one of the first
predictions of SDT to be tested (Green and Swets 1966), whereas that between same-different and other paradigms is a more recent topic of investigation. An interesting aspect of same-different performance is that two distinct decision strategies are available, and their use depends on aspects of the experimental design and stimulus set (Irwin and Francis 1995).

Many identification and classification experiments use stimulus sets that themselves vary in multiple dimensions perceptually, and SDT analysis can be extended to this more general case. Multidimensional detection theory has been particularly helpful in clarifying questions about independence (see Ashby 1992) (see Signal Detection Theory: Multidimensional).
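The one-interval and 2AFC figures quoted above follow from standard equal-variance results for an unbiased observer: p(c) = Φ(d′/2) in the one-interval task and p(c) = Φ(d′/√2) in 2AFC (Macmillan and Creelman 1991). A quick sketch of my own verifying the quoted numbers:

    # Percent correct implied by d' = 1.5 under an unbiased criterion.
    from math import sqrt
    from scipy.stats import norm

    d = 1.5
    print(round(norm.cdf(d / 2), 2))        # one-interval: 0.77
    print(round(norm.cdf(d / sqrt(2)), 2))  # 2AFC: 0.86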
7. Comparing Observed with Ideal or Theoretically Derived Performance

Detection theory has provided a forum for asking a slate of questions about the optimality of performance. One set of issues concerns the degree to which human behavior falls below the ideal level set by stimulus variability. Early auditory experiments using external noise revealed that human 'efficiency' was often quite high, though never perfect. A different type of application concerns the theoretical prediction of the ratio between the S+N and N standard deviations, which can be derived from the shape of the ROC. For example, this ratio should be less than unity for detection experiments in which signal parameters are not known by the observer (Graham 1989). Ratcliff et al. (1994) showed that for an important class of recognition memory models, the ratio should be approximately 0.8, as it is empirically. Some strong assumptions about the decision process that is at the heart of SDT have been tested. In general, these are well supported for one-dimensional representations, but complications sometimes arise for higher dimensionality. Do observers establish fixed criteria? Experiments requiring discrimination between distributions of stimuli have shown that they do when the variability is in one dimension and, with adequate experience, in two dimensions as well. Do observers set 'optimal' criteria? Actual criterion settings are often conservative compared to ideal ones predicted by Bayes's Theorem. In a multidimensional task, observers appear to adopt appropriate decision boundaries when those are simple in form (e.g., linear), but fall short of optimality when more complex rules are required. Does the representation remain fixed in the face of task variation? In one dimension, this amounts to the original, often-replicated datum that d′ (or some sensitivity measure) is unchanged when the criterion changes. In more than one dimension, attentional effects can alter the representation (Nosofsky 1986), and the question of what is invariant in these cases is still unresolved.
See also: Decision Theory: Classical; Perception: History of the Concept; Psychophysical Theory and Laws, History of; Signal Detection Theory, History of; Signal Detection Theory: Multidimensional; Time Perception Models
Bibliography
Ashby F G (ed.) 1992 Multidimensional Models of Perception and Cognition. Erlbaum, Hillsdale, NJ
Braida L D, Durlach N I 1988 Peripheral and central factors in intensity perception. In: Edelman G M, Gall W E, Cowan W M (eds.) Auditory Function. Wiley, New York, pp. 559–83
Graham N V 1989 Visual Pattern Analyzers. Oxford University Press, Oxford, UK
Green D M, Swets J A 1966 Signal Detection Theory and Psychophysics. Wiley, New York
Irwin R J, Francis M A 1995 Perception of simple and complex visual stimuli: Decision strategies and hemispheric differences in same-different judgments. Perception 24: 787–809
Macmillan N A, Creelman C D 1991 Detection Theory: A User's Guide. Cambridge University Press, New York
Nosofsky R M 1986 Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General 115: 39–57
Ratcliff R, McKoon G, Tindall M 1994 Empirical generality of data from recognition memory receiver-operating characteristic functions and implications for the global memory models. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 763–85
Swets J A 1986 Form of empirical ROCs in discrimination and diagnostic tasks. Psychological Bulletin 99: 181–98
Swets J A, Pickett R M 1982 Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York
Yonelinas A P, Jacoby L L 1994 Dissociations of processes in recognition memory: Effects of interference and of response speed. Canadian Journal of Experimental Psychology 48: 516–34
N. A. Macmillan
Signal Detection Theory, History of

1. Introduction and Scope

Signal detection theory (SDT) sprouted from World War II research on radar into a probability-based theory in the early 1950s. It specifies the optimal observation and decision processes for detecting electronic signals against a background of random interference or noise. The engineering theory, culminating in the work of Wesley W. Peterson and Theodore G. Birdsall (Peterson et al. 1954), had foundations in mathematical developments for theories of statistical inference, beginning with those advanced by Jerzy
Neyman and E. S. Pearson (1933). SDT was taken into psychophysics, then a century-old branch of psychology, when the human observer's detection of weak signals, or discrimination between similar signals, was seen by psychologists as a problem of inference. In psychology, SDT is a model for a theory of how organisms make fine discriminations and it specifies model-based methods of data collection and analysis. Notably, through its analytical technique called the receiver operating characteristic (ROC), it separates sensory and decision factors and provides independent measures of them. SDT's approach is now used in many areas in which discrimination is studied in psychology, including cognitive as well as sensory processes. From psychology, SDT and the ROC came to be applied in a wide range of practical diagnostic tasks, in which a decision is made between two confusable alternatives (Swets 1996) (see also Signal Detection Theory).
2. Electronic Detection and Statistical Inference

Central to both electronics and statistics is a conception of a pair of overlapping, bell-shaped, probability (density) functions arrayed along a unidimensional variable, which is the weight of evidence derived from observation. In statistical theory, the probabilities are conditional on the null hypothesis or the alternative hypothesis, while in electronic detection theory they are observational probabilities conditional on noise-alone or signal-plus-noise. The greater the positive weight of evidence, the more likely a signal, or significant experimental effect, is present. A cutpoint must be set along the variable to identify those observed values above the cutpoint that would lead to rejection of the null hypothesis or acceptance of the presence of the designated signal. Errors are of commission or omission: type I and type II errors in statistics, and false alarms and misses in detection. Correct outcomes in detection are hits and correct rejections. Various decision rules specify optimal cutpoints for different decision goals. The weight of evidence is unidimensional because just two decision alternatives are considered. Optimal weights of evidence are monotone increasing with the likelihood ratio, the ratio of the two overlapping bell-shaped probability functions. It is this ratio, and not the shapes of the functions, that is important.
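A minimal sketch of this conception, idealizing the two observational distributions as unit-variance normals (an assumption for illustration only; the separation of 1.5 is arbitrary), shows that the likelihood ratio is monotone increasing in the observed value, so thresholding the likelihood ratio is equivalent to thresholding the observation itself:

```python
from statistics import NormalDist

# Noise-alone ~ N(0,1); signal-plus-noise ~ N(1.5,1): an equal-variance idealization
noise = NormalDist(0, 1)
signal = NormalDist(1.5, 1)

def likelihood_ratio(x):
    # Ratio of the two overlapping bell-shaped density functions at x
    return signal.pdf(x) / noise.pdf(x)

# The ratio increases with x, so any cutpoint on the likelihood ratio
# corresponds to a cutpoint on the weight-of-evidence axis.
for x in (-1.0, 0.0, 0.75, 1.5, 2.5):
    print(x, round(likelihood_ratio(x), 3))
```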
3. Modern Signal Detection Theory

In the early 1950s, Peterson and Birdsall, then graduate students in electrical engineering and mathematics, respectively, at the University of Michigan, developed the general mathematical theory of signal detection that is still current. In the process they devised the ROC—a graphical technique permitting measurement of two independent aspects of detection performance: (a) the location of the observer's cutpoint or decision criterion and (b) the observer's sensitivity, or ability to discriminate between signal-plus-noise and noise-alone, irrespective of any chosen cutpoint. The two measures characterize performance better than the single one often used, namely, the signal-to-noise ratio (SNR) expressed in energy terms that is necessary to provide a 50 percent hit (correct signal acceptance) probability at a fixed decision cutpoint, say, one yielding a conditional false-alarm probability of 0.05. A curve showing conditional hit probability versus SNR is familiar in detection theory and is equivalent to the power function of statistical theory. These curves can be derived as a special case of the ROCs for the SNR values of interest.
4. The ROC

The ROC is a plot of the conditional hit (true-positive) proportion against the conditional false-alarm (false-positive) proportion for all possible locations of the decision cutpoint. Peterson and Birdsall's SDT specifies a form of ROC that begins at the lower left corner (where both proportions are zero) and rises with smoothly decreasing slope to the upper right corner (where both proportions are 1.0)—as the decision cutpoint varies from very strict (at the right end of the decision variable) to very lenient (at the left end). This is a 'proper' ROC, and specifically not a 'singular' ROC, or one intersecting the axes at points between 0 and 1.0. The locus of the curve, or just the proportion of area beneath it, gives a measure of discrimination capacity. An index of a point along the curve, for example, the slope of the curve at the point, gives a measure of the decision cutpoint that yielded that point (the particular slope equals the criterial value of likelihood ratio). SDT specifies the discrimination capacity attainable by an 'ideal observer' for any SNR for various practical combinations of signal and noise parameters (e.g., signal specified exactly and signal specified statistically), and hence the human observer's efficiency can be calculated under various signal conditions. SDT also specifies the optimal cutoff as a function of the signal's prior probability and the benefits and costs of the decision outcomes.
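The following sketch traces out a proper ROC of this form under the equal-variance Gaussian idealization (the d′ value is arbitrary); sweeping the cutpoint from strict to lenient moves the point from near (0, 0) toward (1, 1), and for this model the area beneath the curve is Φ(d′/√2):

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf
d_prime = 1.0  # assumed separation between the two distributions

# Each cutpoint c yields one (false-alarm, hit) point on the ROC.
for c in [2.5, 2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0]:
    false_alarm = 1 - Phi(c)           # area of N(0,1) above c
    hit = 1 - Phi(c - d_prime)         # area of N(d',1) above c
    print(round(false_alarm, 3), round(hit, 3))

# Area under the equal-variance Gaussian ROC:
print(round(Phi(d_prime / sqrt(2)), 3))
```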
5. Psychology's Need for SDT and the ROC
Wilson P. Tanner, Jr., and John A. Swets, graduate students in psychology at Michigan in the early 1950s, became aware of Peterson's and Birdsall's work on the same campus as it began. Tanner became acquainted through his interest in mathematical and electronic concepts as models for psychological and neural processes, and Swets was attracted because of his interest in quantifying the 'instruction stimulus' in
psychophysical tasks, in order to control the observer's response criterion. As they pursued studies in sensory psychology, these psychologists had become dissatisfied with psychophysical and psychological theories based on overlapping bell-shaped functions that assumed fixed decision cutpoints and incorporated measures only of discrimination capacity, which were possibly confounded by attitudinal or decision processes. Prominent among such theories were those of Gustav Theodor Fechner (1860), Louis Leon Thurstone (1927), and H. Richard Blackwell (1963); see Swets (1996, Chap. 1). Fechner and Thurstone conceived of a symmetrical cutpoint, where the two distributions cross, because they presented two stimuli on each trial with no particular valence; that is, both stimuli were 'signals' (lifted weights, samples of handwriting). Fechner also studied detection of single signals, but did not compare them to a noise-alone alternative, and his theory was essentially one of variable representations of a signal compared to an invariant cutpoint. This cutpoint was viewed as a physiologically determined sensory threshold (akin to all-or-nothing nerve firing) and hence the values of the sensory variable beneath it were thought to be indistinguishable from one another. Blackwell's task and model were explicitly of signals in noise, but the observer was thought to have a fixed cutpoint near the top of the noise function, for which the false-alarm probability was negligible, and, again, the values below that cutpoint were deemed indistinguishable. For both threshold theorists, the single measure of performance was the signal strength required to yield 50 percent correct positive responses, like the effective SNR of electronic detection theory, and it was taken as a statistical estimate of a sensory threshold. In psychology, the curve relating percentage of signals detected to signal strength is called the psychometric function. There had been some earlier concern in psychology for the effect of the observer's attitude on the measured threshold, e.g., Graham's (1951) call for quantification of instruction stimuli in the psychophysical equation, but no good way to deal with non-sensory factors had emerged. In 1952, Tanner and Swets joined Peterson and Birdsall as staff in a laboratory of the electrical engineering department called the Electronic Defense Group. They were aware of then-new conceptions of neural functioning in which stimulus inputs found an already active nervous system and neurons were not lying quiescent till fired at full force. In short, neural noise as well as environmental noise was likely to be a factor in detection, and then the observer's task is readily conceived as a choice between statistical hypotheses. Though a minority idea in the history of psychophysics (see Corso 1963), the possibility that the observer deliberately sets a cutpoint on a continuous variable (weight of evidence) seemed likely to Tanner and Swets. A budding cognitive psychology (e.g., the 'new look' in perception) supported the notion
of extrasensory determinants of perceptual phenomena, as represented in SDT by expectancies (prior probabilities) and motivation (benefits and costs).
6. Psychophysical Experiments

The new SDT was first tested in Swets's doctoral thesis in Blackwell's vision laboratory (Tanner and Swets 1954; Swets et al. 1961–see Swets 1964, Chap. 1). It was then tested with greater control of the physical signal and noise parameters in new facilities for auditory research in the electrical engineering laboratory. Sufficiently neat empirical ROCs were obtained in the form of the curved arc specified by SDT. ROCs found to be predicted by Blackwell's 'high-threshold' theory—straight lines extending from some point along the left axis, that depended on signal strength, to the upper right hand corner—clearly did not fit the data. Other threshold theories, e.g., 'low-threshold' theories, did not fare much better in experimental tests (Swets 1961–see Swets 1964, Chap. 4). It was recognized later that linear ROCs of slope = 1, intersecting the left and upper edges of the graph symmetrically, were predicted by several other measures and their implicit models (Swets 1996, Chap. 3) and also gave very poor fits to sensory data (and other ROC data to be described; Swets 1996, Chap. 2). After a stint as observer in Swets's thesis studies, David M. Green joined the engineering laboratory as an undergraduate research assistant. He soon became a full partner in research, and co-authored a laboratory technical report with Tanner and Swets in 1956 that included the first auditory studies testing SDT. Birdsall has collaborated in research with the psychologists over the following decades and commented on a draft of this article. The measure of discrimination performance used first was denoted d′ ('d prime') and defined as the difference between the means of two implicit, overlapping, normal (Gaussian) functions for signal-plus-noise and noise-alone, of equal variance, divided by their common standard deviation. A value of d′ can be calculated for each ROC point, using the so-called single-stimulus yes–no method of data collection. Similarly, a value of d′ can be calculated for an observer's performance under two other methods of data collection: the multiple-stimulus forced-choice method (one stimulus being signal-plus-noise and the rest noise-alone) and the single-stimulus confidence-rating method. Experimental tests showed that when the same signal and nonsignal stimuli were used with the three methods, essentially the same values of d′ were obtained. The measures specified by threshold theories had not achieved that degree of consistency or internal validity.
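In the equal-variance Gaussian model just described, d′ can be recovered from a single (false alarm, hit) pair as a difference of standard normal quantiles. A minimal sketch, with made-up proportions:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def d_prime(hit_rate, false_alarm_rate):
    # Equal-variance Gaussian model: d' = z(H) - z(F)
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical yes-no data: 69 hits per 100 S+N trials, 31 false alarms per 100 N trials
print(round(d_prime(0.69, 0.31), 2))  # about 0.99
```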
For many, the most conclusive evidence favoring SDT's continuous decision variable and rejecting the high-threshold theory came from an experiment suggested by Robert Z. Norman, a Michigan mathematics student. Carried out with visual stimuli, the critical finding was that a second choice made in a four-alternative forced-choice test, when the first choice was incorrect, was correct with probability greater than 1/3, and that the probability of it being correct increased with signal strength (Swets et al. 1961, Swets 1964, Chap. 1). Validating the rating method, as mentioned above, had the important effect of providing an efficient method for obtaining empirical ROCs. Whereas under the yes–no method the observer is induced to set a different cutpoint in each of several observing sessions (each providing an ROC point), under the rating method the observer effectively maintains several decision cutpoints simultaneously (the boundaries of the rating categories) so that several empirical ROC points (enough to define a curve) can be obtained in one observing session.
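A sketch of the rating-method bookkeeping, using invented confidence-rating counts: cumulating the rating categories from the most confident 'signal' end downwards yields one empirical ROC point per category boundary:

```python
from itertools import accumulate

# Invented confidence-rating counts (1 = sure noise ... 5 = sure signal)
signal_counts = [5, 10, 15, 30, 40]
noise_counts = [40, 30, 15, 10, 5]

# Each cumulative total corresponds to one decision cutpoint, hence one ROC point.
hits = [c / sum(signal_counts) for c in accumulate(reversed(signal_counts))]
fas = [c / sum(noise_counts) for c in accumulate(reversed(noise_counts))]
for f, h in zip(fas, hits):
    print(round(f, 2), round(h, 2))
```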
7. Dissemination of SDT in Psychology

Despite the growing evidence for it, acceptance of SDT was not rapid or broad in psychophysical circles. This was partly due to threshold theory having been ingrained in the field for a century (as a concept and a collection of methods and measures), and probably because SDT arose in an engineering context (and retained the engineers' terminology), the early visual data were noisy, and the first article (Tanner and Swets 1954) was cryptic. Dissemination was assisted when J. C. R. Licklider brought Swets and Green to the Massachusetts Institute of Technology, where they participated in Cambridge's hotbed of psychophysics and mathematical psychology, and where offering a special summer course for postgraduates led to a published collection of approximately 35 of the articles on SDT in psychology that had appeared by then, along with tables of d′ (Swets 1964). Licklider also hired both of them part-time at the research firm of Bolt Beranek and Newman Inc., where they obtained contract support from the National Aeronautics and Space Administration to write a systematic textbook (Green and Swets 1966). Meanwhile, Tanner mentored a series of exceptional graduate students at Michigan. All were introduced by Licklider to the Acoustical Society of America, where they enjoyed a feverishly intense venue for discussion of their ideas at biannual meetings and in the Society's journal. Another factor was the active collaboration with Tanner of James P. Egan and his students at Indiana University.
8. Extensions of SDT in Psychology

Egan recognized the potential for applying SDT in psychology beyond the traditional psychophysical tasks. He extended it to the less tightly defined vigilance task, that is, the practical military and industrial observing task of 'low-probability watch' (Egan et al. 1961–see Swets 1964, Chap. 15), and also to speech communication (Egan and Clarke 1957–see Swets 1964, Chap. 30). He also brought SDT to purely cognitive tasks with experiments in recognition memory (see Green and Swets 1966, Sect. 12.9). Applications were then made by others to those tasks, and also to tasks of conceptual judgment, animal discrimination and learning, word recognition, attention, visual imagery, manual control, and reaction time (see Swets 1996, Chap. 1). A broad sample of empirical ROCs obtained in these areas indicates their very similar forms (Swets 1996, Chap. 2). An Annual Review chapter reviews applications in clinical psychology, for example, to predicting acts of violence (McFall and Treat 1999).
9. Applications in Diagnostics
An analysis of performance measurement in information retrieval suggested that two-by-two contingency tables for any diagnostic task were grist for SDT's mill (Swets 1996, Chap. 9). This idea was advanced considerably by Lusted's (1968) application to medical diagnosis. A standard protocol for evaluating diagnostic performance via SDT methods, with an emphasis on medical images, was sponsored by the National Cancer Institute (Swets and Pickett 1982). By 2000, 'ROC' was specified as a key word in over 1,000 medical articles each year, ranging from radiology to blood analysis. Other diagnostic applications are being made to aptitude testing, materials testing, weather forecasting, and polygraph lie detection (Swets 1996, Chap. 4). A recent development is to use SDT to improve, as well as to evaluate, diagnostic accuracy. Observers' ratings of the relevant dimensions of a diagnostic task—e.g., perceptual features in an X-ray—are merged in a manner specified by the theory to yield an optimal estimate of the probability that a 'signal' is present (Swets 1996, Chap. 8). The latest work both in psychology and diagnostics has been described didactically (Swets 1998).

See also: Psychophysical Theory and Laws, History of; Signal Detection Theory; Signal Detection Theory: Multidimensional
Bibliography
Blackwell H R 1963 Neural theories of simple visual discriminations. Journal of the Optical Society of America 53: 129–60
Corso J F 1963 A theoretico-historical review of the threshold concept. Psychological Bulletin 60: 356–70
Fechner G T 1860 Elemente der Psychophysik. Breitkopf & Härtel, Leipzig, Germany [English translation of Vol. 1 by Adler H E 1966. In: Howes D H, Boring E G (eds.) Elements of Psychophysics. Holt, Rinehart and Winston, New York]
Graham C H 1951 Visual perception. In: Stevens S S (ed.) Handbook of Experimental Psychology. Wiley, New York
Green D M, Swets J A 1966 Signal Detection Theory and Psychophysics. Wiley, New York. Reprinted 1988 by Peninsula, Los Altos Hills, CA
Lusted L B 1968 Introduction to Medical Decision Making. C. C. Thomas, Springfield, IL
McFall R M, Treat T A 1999 Quantifying the information value of clinical assessments with signal detection theory. Annual Review of Psychology 50: 215–41
Neyman J, Pearson E S 1933 On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London A231: 289–311
Peterson W W, Birdsall T G, Fox W C 1954 The theory of signal detectability. Transactions of the Institute of Radio Engineers Professional Group on Information Theory, PGIT 4: 171–212. Also in: Luce R D, Bush R R, Galanter E 1963 (eds.) Readings in Mathematical Psychology. Wiley, New York, Vol. 1
Swets J A (ed.) 1964 Signal Detection and Recognition by Human Observers. Wiley, New York. Reprinted 1988 by Peninsula, Los Altos Hills, CA
Swets J A 1996 Signal Detection Theory and ROC Analysis in Psychology and Diagnostics: Collected Papers. L. Erlbaum Associates, Mahwah, NJ
Swets J A 1998 Separating discrimination and decision in detection, recognition, and matters of life and death. In: Osherson D (series ed.) Scarborough D, Sternberg S (vol. eds.) An Invitation to Cognitive Science: Vol. 4, Methods, Models, and Conceptual Issues. MIT, Cambridge, MA
Swets J A, Pickett R M 1982 Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York
Tanner W P Jr, Swets J A 1954 A decision making theory of visual detection. Psychological Review 61: 401–9
Thurstone L L 1927 A law of comparative judgment. Psychological Review 34: 273–86
J. A. Swets
Signal Detection Theory: Multidimensional

Multidimensional signal detection theory (MSDT) is an extension of signal detection theory (SDT) to more than one dimension, with each dimension representing a different source of information along which the 'signal' is registered. The development of MSDT was motivated by psychologists who wanted to study information processing of more realistic, multidimensional stimuli, for example, the shape and color of geometric objects, or pitch, loudness, and timbre of tones. The typical questions of interest are: Are the dimensions (e.g., color and shape of an object) processed independently, or does the perception of
one dimension (e.g., color) depend on the perception of the second (shape), and if so, what is the nature of their dependence? In MSDT, several types of independence of perceptual dimensions are defined and related to two sets (marginal and conditional) of sensitivity, d′, and response bias, β, parameters. This article gives the definitions, and outlines the uses in studying information processing of dimensions. MSDT is also known as General Recognition Theory (Ashby and Townsend 1986) or Decision Bound Theory (e.g., Ashby and Maddox 1994).
1. Introduction and Notation

Stimulus dimensions, A, B, …, represent different types of information about a class of stimuli, for example, in a two-dimensional (2-D) case, the shape (A) and color (B) of objects. The researcher specifies the levels on each dimension to study (e.g., square and rectangle are levels of shape, and blue, turquoise, and green are levels of color). A multidimensional 'signal' is stimulus A_iB_j, where the subscript indicates the specific level on the corresponding dimension (e.g., a green square is A_1B_3). For each physical stimulus dimension A (B, …), there is assumed a corresponding unique psychological or perceptual dimension X (Y, …). The minimal stimulus set considered in this article consists of four 2-D stimuli, {A_1B_1, A_1B_2, A_2B_1, A_2B_2}, constructed by combining two levels on dimension A (i = 1, 2) with each of two levels on dimension B (j = 1, 2). For example, the set constructed from two levels of color (blue, green) and two levels of shape (square, rectangle) is {blue square, blue rectangle, green square, green rectangle}. On a single trial, one stimulus is selected from the set at random, and it is presented to the observer, whose task is to identify the stimulus. The presentation of a 2-D stimulus results in a percept, a point (x, y) in the X–Y perceptual space. The observer's response is based on the percept's location relative to the observer's placement of decision criterion bounds in the perceptual space, typically one bound per dimension, denoted c_A, c_B, …. These bounds divide the perceptual space into response regions such that if, for example, (x, y) falls above c_A and below c_B, the (2-D) response would be '21' (or any response that uniquely identifies it as A_2B_1).

As in unidimensional SDT, multiple presentations of a single stimulus are assumed to lead to a distribution of percepts, and one distribution (usually Gaussian) is assumed for each stimulus. An example with four 2-D stimuli is illustrated in Fig. 1. The bivariate joint densities, f_{A_iB_j}(x, y), are represented more easily as equal-density contours (Fig. 1(b)), which are obtained by slicing off the tops of the joint densities in Fig. 1(a) at a constant height and viewing the space from 'above.' The joint density for each stimulus A_iB_j yields: (a) marginal densities on each dimension, g_{A_iB_j}(x) and g_{A_iB_j}(y), obtained by integrating over all values of the second dimension (see Fig. 1(c) for marginal densities g_{A_iB_j}(x) for dimension A); and (b) conditional densities for each dimension (not shown), conditioning on the observer's response to the second dimension, and obtained by integrating only over values either above (or below) the decision bound on the second dimension, for example, g_{A_iB_j}(x | y ≤ c_B).

An observer's responses are tabulated in a 4×4 confusion matrix, a table of conditional probability estimates, P(R_p | A_iB_j), that are proportions of trials on which response R_p was emitted when stimulus A_iB_j was presented. These proportions estimate the volumes under each joint density in each response region. For example, the proportion of trials on which the observer responded '11' when in fact stimulus A_1B_2 was presented is

\[
P(R_{11} \mid A_1B_2) = \int_{-\infty}^{c_A} \int_{-\infty}^{c_B} f_{A_1B_2}(x, y)\, dx\, dy
\tag{1}
\]

[Figure 1. (a) The perceptual space showing four joint bivariate densities, f_{A_iB_j}(x, y), that represent all possible (theoretical) perceptual effects for each of the four 2-D stimuli A_iB_j; (b) equal-density contours of the bivariate densities in (a) and the decision bounds c_A and c_B; (c) marginal densities, g_{A_iB_j}(x), representing perceptual effects for each stimulus A_iB_j on dimension A only. In this example, g_{A_1B_1}(x) = g_{A_1B_2}(x) but g_{A_2B_1}(x) ≠ g_{A_2B_2}(x).]

2. Definitions of Independence of Dimensions

Several definitions of independence of dimensions are defined within MSDT. These are: perceptual separability (PS), decisional separability (DS), and perceptual independence (PI).

2.1 Perceptual Separability (PS) and Decisional Separability (DS)

Dimension A is PS from dimension B if the perceptual effects of A are equal across levels of dimension B, that is, g_{A_iB_1}(x) = g_{A_iB_2}(x), for all values of x and both i = 1 and i = 2. Similarly, dimension B is perceptually separable from dimension A if g_{A_1B_j}(y) = g_{A_2B_j}(y), for all y, and j = 1, 2. DS is a similar type of independence of the decisions that the observer makes about the stimuli. DS holds for dimension A (or B) when the decision bound c_A (or c_B) is parallel to the Y- (or X-) perceptual axis. PS and DS are global types of independence defined for perceptual dimensions across levels of the second dimension for all stimuli in the set. They can also be asymmetrical relations. In Fig. 1(b), for example, PS holds for dimension B across levels of dimension A, but PS fails for dimension A across levels of B; also, DS holds for dimension A but fails for dimension B.
2.2 Perceptual Independence (PI)

PI is distinguished from PS in that it is a local form of independence among dimensions within a single stimulus. PI holds within stimulus A_iB_j if

\[
f_{A_iB_j}(x, y) = g_{A_iB_j}(x)\, g_{A_iB_j}(y)
\]

for all (x, y). This is statistical independence, and it is a symmetric relation in a given stimulus. The shape of an equal-density contour describes the correlation among the perceived dimensions in one stimulus. A circle indicates PI in a stimulus with variances equal on the two dimensions. An ellipse with its major and minor axes parallel to the perceptual axes also indicates PI, but with unequal variances on the two dimensions. A tilted ellipse indicates failure of PI, with the direction of the tilt (positive or negative slope of the major axis) indicating the direction of the dependence. In Fig. 1(b), stimuli A_1B_1 and A_1B_2 show PI, stimulus A_2B_2 shows a positive dependence, and stimulus A_2B_1 shows a negative dependence.
3. Marginal and Conditional d′s and βs

As in unidimensional SDT, sensitivity, d′, and response bias, β, are parameters that describe different characteristics of the observer. Two sets of these parameters are defined: (a) marginal d′s and βs, and (b) conditional d′s and βs. The marginal and conditional d′s and βs for the four-stimulus set discussed here can be generalized to any number of dimensions and any number of levels: for details, see Kadlec and Townsend (in Ashby 1992, Chap. 8).

3.1 Marginal d′s and βs

Marginal d′s and βs are defined using the marginal densities, g_{A_iB_j}(x) and g_{A_iB_j}(y). Let the mean vector and variance–covariance matrix of f_{A_iB_j}(x, y), for stimulus A_iB_j, be denoted by

\[
\mu(A_iB_j) = \begin{bmatrix} \mu_x(A_iB_j) \\ \mu_y(A_iB_j) \end{bmatrix}, \qquad
\Sigma(A_iB_j) = \begin{bmatrix} \sigma_x^2(A_iB_j) & \sigma_{xy}(A_iB_j) \\ \sigma_{xy}(A_iB_j) & \sigma_y^2(A_iB_j) \end{bmatrix}
\tag{2}
\]

respectively. The marginal d′ for dimension A at level j of dimension B is

\[
d'(A \text{ at } B_j) = \frac{\mu_x(A_2B_j) - \mu_x(A_1B_j)}{\sigma_x(A_1B_j)}, \quad j = 1, 2
\tag{3}
\]

The subscript x indicates that for a d′ on dimension A, only the X-coordinates of the mean vectors and the standard deviation along the X-perceptual axis are employed. As in unidimensional SDT, the standard deviation of the 'noise' stimulus (here indicated as level 1 of dimension A) is used, and without loss of generality, it is assumed that μ(A_1B_1) = [0 0]ᵀ (ᵀ denotes transpose) and Σ(A_1B_1) = I (i.e., σ_x² = σ_y² = 1, and σ_xy = 0). Similarly, the marginal d′ for dimension B at level i of dimension A is

\[
d'(B \text{ at } A_i) = \frac{\mu_y(A_iB_2) - \mu_y(A_iB_1)}{\sigma_y(A_iB_1)}, \quad i = 1, 2
\tag{4}
\]

In a parallel fashion, marginal βs are defined as

\[
\beta(A \text{ at } B_j) = \frac{g_{A_2B_j}(c_A)}{g_{A_1B_j}(c_A)}, \quad j = 1, 2
\qquad \text{and} \qquad
\beta(B \text{ at } A_i) = \frac{g_{A_iB_2}(c_B)}{g_{A_iB_1}(c_B)}, \quad i = 1, 2
\tag{5}
\]

where c_A and c_B are the decision bounds for dimensions A and B, respectively.

3.2 Conditional d′s and βs

In the four-stimulus set, two pairs of conditional d′s and βs are defined for each dimension, conditioning on the observer's response on the second. Using the terminology of unidimensional SDT for simplicity, with level 1 indicating the 'noise' stimulus and level 2 the 'signal' stimulus, one pair of conditional d′s for dimension A, with B presented at level 1, is: (a) d′(A | 'correct rejection' on B)—conditioned on the observer's correct response that B was at level 1, defined using parameters of g_{A_iB_1}(x | y ≤ c_B), i = 1 and 2, in Eqn. (3); and (b) d′(A | 'false alarm' on B)—conditioned on the incorrect response that B was at level 2, using parameters of g_{A_iB_1}(x | y > c_B). The second pair of conditional d′s for dimension A, with B presented at level 2, is: (c) d′(A | 'hit' on B)—conditioned on the correct response that B was at level 2; and (d) d′(A | 'miss' on B)—conditioned on the incorrect response that B was at level 1. Similar pairs are defined for conditional βs, and for d′s and βs for dimension B conditional on dimension A responses.
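These marginal quantities can be estimated from data. The sketch below uses a purely hypothetical 4×4 confusion matrix and estimates the marginal d′ of Eqn. (3) at each level of B by collapsing responses over the B component and applying the usual z-score formula; the comparison across levels anticipates the separability tests of Sect. 4:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf

# Hypothetical 4x4 confusion matrix: rows = stimuli A1B1, A1B2, A2B1, A2B2;
# columns = response counts for '11', '12', '21', '22' (100 trials per stimulus)
counts = {
    'A1B1': [60, 20, 15, 5],
    'A1B2': [25, 55, 5, 15],
    'A2B1': [15, 5, 55, 25],
    'A2B2': [5, 15, 20, 60],
}

def p_respond_A2(stim):
    # Collapse over the B component of the response: responses '21' or '22'
    row = counts[stim]
    return (row[2] + row[3]) / sum(row)

def marginal_d_prime_A(level_of_B):
    # Estimate of Eqn. (3): z of collapsed 'hit' and 'false alarm' proportions
    f = p_respond_A2('A1B%d' % level_of_B)   # A actually at level 1
    h = p_respond_A2('A2B%d' % level_of_B)   # A actually at level 2
    return z(h) - z(f)

# Roughly equal values across levels of B are consistent with perceptual
# separability of dimension A from dimension B.
print(round(marginal_d_prime_A(1), 2), round(marginal_d_prime_A(2), 2))
```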
4. Using the d′s and βs to Test Dimensional Interactions

These d′ and β parameters are linked theoretically to PS, DS, and PI (Kadlec and Townsend 1992). Their estimates, obtained from the confusion matrix proportions (Eqn. (1)), can thus be used to draw tentative inferences about dimensional interactions (Kadlec 1999). DS for dimension A (or B) is inferred when c_A (c_B) estimates are equal at all levels of dimension B (A). Equal marginal d′s for a dimension across levels of the other are consistent with PS on that dimension. A stronger inference for PS on both dimensions can be drawn if the densities form a rectangle (but not a
parallelogram), and this condition is testable if DS and PI hold (Kadlec and Hicks 1998; see also Thomas 1999 for new theoretical developments). When DS holds, evidence for PI is gained if conditional d′s are found to be equal; conversely, unequal conditional d′s indicate PI failure, and the direction of the inequality provides information about the direction of the dimensional dependencies in the stimuli. Exploring these relationships and looking for stronger tests of PS and PI is currently an active area of research. MSDT is a relatively new development, and with its definitions of independence of dimensions, it is a valuable analytic and theoretical tool for studying interactions among stimulus dimensions. Current applications include studies of (a) human visual perception (of symmetry, Gestalt 'laws' of perceptual grouping, and faces); and (b) source monitoring in memory. It can be used in any research domain where the study of dimensional interactions is of interest.

See also: Memory Psychophysics; Multidimensional Scaling in Psychology; Psychometrics; Psychophysics; Signal Detection Theory; Signal Detection Theory, History of; Visual Perception, Neural Basis of
Bibliography
Ashby F G (ed.) 1992 Multidimensional Models of Perception and Cognition. Erlbaum, Hillsdale, NJ
Ashby F G, Maddox W T 1994 A response time theory of separability and integrality in speeded classification. Journal of Mathematical Psychology 38: 423–66
Ashby F G, Townsend J T 1986 Varieties of perceptual independence. Psychological Review 93: 154–79
Kadlec H 1999 MSDA-2: Updated version of software for multidimensional signal detection analyses. Behavior Research Methods, Instruments & Computers 31: 384–5
Kadlec H, Hicks C L 1998 Invariance of perceptual spaces. Journal of Experimental Psychology: Human Perception & Performance 24: 80–104
Kadlec H, Townsend J T 1992 Implications of marginal and conditional detection parameters for the separabilities and independence of perceptual dimensions. Journal of Mathematical Psychology 36: 325–74
Thomas R D 1999 Assessing sensitivity in a multidimensional space: Some problems and a definition of a general d′. Psychonomic Bulletin & Review 6: 224–38
H. Kadlec
Significance, Tests of

A recent article on the health benefits of herbal tea reported that its use led to a decreased incidence of insomnia in an experiment conducted at a sleep disorders clinic. Patients at the clinic were randomly
assigned to daily consumption of herbal tea or a caffeine-free beverage of their choice, and were followed up for 10 months. The reported improvement was stated to be 'statistically significant (p < 0.05).' The implication intended was that the improvement should be attributed to the beneficial effects of the tea. In fact, this article does not really exist, but examples of this sort are reported regularly in science articles in the press, and are very common in journal publications in a number of fields in science and social science. It has even been argued, in the press, that 'statistical significance' has become so widely used to give an imprimatur of scientific acceptability that it is in fact misused and should be abandoned (Matthews 1998).

A test of significance assesses the agreement between the data and an hypothesized statistical model. The magnitude of the agreement is expressed as an observed level of significance, or p-value, which is the probability of obtaining data as or more extreme than the observed data, if the hypothesized model were true. A very small p-value suggests that either the observed data are not compatible with the hypothesized model or an event of very small probability has been observed. A large p-value indicates that the observed data are compatible with the hypothesized model. Since the p-value is a probability, it must be between 0 and 1, and a very common convention is to declare values smaller than 0.05 as 'small' or 'statistically significant' and values larger than 0.05 as 'not statistically significant.' In this section we shall try to explain the historical rationale of this very arbitrary cut-off point. To make these vague definitions more concrete, it is necessary to consider statistical models and the notion of a (simplifying) hypothesis within that model. The theory of this is outlined in Sect. 2 with some more advanced topics considered in Sect. 3. This section concludes with a highly idealized example to convey the idea of data being inconsistent with an hypothesized model.

Example 1. Students in a statistics class partake in an activity to assess their ability to distinguish between two competing brands of cola, and to identify their preferred brand from taste alone. Each of the 20 students expresses a preference for one brand or the other, but just one student claims to be able to discriminate perfectly between the two. Twenty cups of each brand are prepared by the instructor and labelled '1' and '2.' Each student is to record which label corresponds to Brand A. The result is that 12 students correctly identify the competing brands, although not the student who claimed a perfect ability to discriminate. The labelling of the cups as 1 or 2 by the instructor was completely random, i.e., cup 1 was equally likely to contain Brand A or B. The students did not discuss their opinions with their classmates, and the taste testing was completed fairly quickly. Under these conditions, it is plausible that each student has a
Significance, Tests of Table 1 Probability of zero or one mistakes in n independent Bernoulli trials with probability of a mistake l 0.5 n
Probability
5 6 7 8 9 10 11 12 13 14 15
0.1875 0.1094 0.0625 0.0352 0.0195 0.0107 0.0059 0.0032 0.0017 0.0009 0.0002
plays the role of a conservative position that the experimenter hopes to disprove, and one reason for requiring rather small p-values before declaring statistical significance is to raise the standard of proof required to replace a relatively simple working hypothesis with one that is possibly more complex and less well understood. As formulated here, the hypothesis being tested is that the probability of a correct choice is 0.5, and not the other aspects of the model, such as independence of the trials, and unchanging probability of success. The number of observed successes does not measure such model features; it provides information; only on the probability of success. Functions of the data that do measure such model features can be constructed, and from these significance tests that assess the fit of an assumed model; these play an important role in statistical inference as well.
probability of " of identifying the brands correctly # simply by guessing, so that about 10 students would identify the brands correctly with no discriminatory ability at all. That 12 students did so does not seem inconsistent with guesswork, and the p-value helps to quantify this. The probability of observing 12 or more correct results if one correct result has probability " and the guesses are independent can be computed by# the binomial formula as
(020121j020131j…j020201*("#)#! l 0.34.
(1)
This is not at all an unlikely event, so there is no evidence from these data that the number of correct answers could not have been obtained by guessing: in more statistical language, assuming a binomial model, the observed data is consistent with probability of success ". # The student who claimed to have perfect discrimination, but actually guessed incorrectly, argued that her abilities should not be dismissed on the basis of one taste test, so the class carried out some computations to see what the p-value for the same observed data would be if the number of trials was increased. The probability of one or zero mistakes in a set of n trials for various values of n, is given in Table 1. From this we see that, for example, one or no mistakes in five trials is consistent with guesswork, but the same result in 10 trials is much less so. In both parts of this example we assumed a model of independent trials, each of which could result in a success or failure, with constant probability of success. Our calculations also assumed this constant probability of success was 0.5. This latter restriction on the model is often called a ‘null hypothesis’, and the test of significance is a test of this null hypothesis; the p-value measures the consistency of the data with this null hypothesis. In many applications the null hypothesis 14086
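These binomial computations are easily checked; the sketch below reproduces the p-value in Eqn. (1) and the column of Table 1, whose entries are (n + 1)/2ⁿ:

```python
from math import comb

# p-value for 12 or more correct out of 20 when each guess succeeds with prob 1/2
p_value = sum(comb(20, k) for k in range(12, 21)) / 2**20
print(round(p_value, 2))  # 0.25

# Table 1: probability of zero or one mistakes in n trials is (n + 1) / 2**n
for n in range(5, 16):
    print(n, round((n + 1) / 2**n, 4))
```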
1. Model Based Inference

1.1 Models and Null Hypothesis

We assume that we have a statistical model for a random variable Y taking values in a sample space 𝒴, described by a parametric family of densities {f(y; θ); θ ∈ Θ}. Tests of significance can in fact be constructed in more general settings, but this framework is useful for defining the main ideas. If Y is the total number of successes in n independent Bernoulli trials with constant probability of success, then

\[
f(y; \theta) = \binom{n}{y} \theta^{y} (1-\theta)^{n-y}
\tag{2}
\]

with Θ = [0, 1] and 𝒴 = {0, 1, …, n}. If Y is a continuous random variable following a normal or bell curve distribution with mean θ₁ and variance θ₂², then

\[
f(y; \theta) = \frac{1}{\sqrt{2\pi}\,\theta_2} \exp\left\{ -\frac{1}{2\theta_2^2} (y-\theta_1)^2 \right\}
\tag{3}
\]

with Θ = ℝ × ℝ⁺ and 𝒴 = ℝ. The model for n independent observations from this distribution is

\[
f(y_1, \ldots, y_n; \theta) = \frac{1}{(\sqrt{2\pi})^{n} \theta_2^{n}} \exp\left\{ -\frac{1}{2\theta_2^2} \sum_{i=1}^{n} (y_i-\theta_1)^2 \right\}
\tag{4}
\]

with Θ = ℝ × ℝ⁺ and 𝒴 = ℝⁿ. For further discussion of statistical models, see Statistical Sufficiency; Distributions, Statistical: Special and Discrete; Distributions, Statistical: Approximations; Distributions, Statistical: Special and Continuous.
As noted above, we assume the model is given, and our interest is in inference about the parameter θ. While this could take various forms, a test of significance starts with a so-called null hypothesis about θ, of the form

\[
H_0 : \theta = \theta_0
\tag{5}
\]

or

\[
H_0 : \theta \in \Theta_0 .
\tag{6}
\]

In Eqn. (5) the parameter θ is fully specified, and H₀ is called a point null hypothesis or a simple null hypothesis. If θ is not fully specified, as in Eqn. (6), H₀ is called a composite null hypothesis. In the taste-testing examples the simple null hypothesis was θ = 0.5. In the normal model, Eqn. (3), a hypothesis about the mean, such as H₀: θ₁ = 0, is composite, since the variance is left unspecified. Another composite null hypothesis is H₀: θ₂ = θ₁, which restricts the full parameter space to a one-dimensional curve in ℝ × ℝ⁺.

A test is constructed by choosing a test statistic, which is a function of the data that in some natural way measures departure from what is expected under the null hypothesis, and which has been standardized so that its distribution is known either exactly or to a good approximation under the null hypothesis. Test statistics are usually constructed so that large values indicate a discrepancy from the hypothesis.

Example 2. In the binomial model (2), the distribution of Y is completely specified by the null hypothesis θ = 0.5 as

\[
f(y) = \binom{n}{y} 2^{-n}
\tag{7}
\]

and consistency of a given observed value y₀ of Y is measured by the p-value Σ_{y=y₀}^{n} \binom{n}{y} 2^{-n}, the probability of observing a value as or more extreme than y₀. If y₀ is quite a bit smaller than expected, then it would be more usual to compute the p-value as Σ_{y=0}^{y₀} \binom{n}{y} 2^{-n}. Each of these calculations was carried out in the discussion of taste testing in Sect. 1.

Example 3. In independent sampling from the normal distribution, given in Eqn. (4), we usually test the composite null hypothesis H₀: θ₁ = θ₁₀ by constructing the t-statistic

\[
T = \sqrt{n}\,(\bar{Y} - \theta_{10})/S
\tag{8}
\]

where Ȳ = n⁻¹ Σ_{i=1}^{n} Yᵢ and S² = (n−1)⁻¹ Σ (Yᵢ − Ȳ)². Under H₀, T follows a t-distribution on n−1 degrees of freedom, and the p-value is Pr{T > √n (ȳ − θ₁₀)/s}, where ȳ and s are the values observed in the sample. This probability needs to be computed numerically from an expression for the cumulative distribution function of the t-distribution. Historically, tables of this distribution were provided for ready reference, typically by identifying a few critical values, such as t₀.₁₀, t₀.₀₅, and t₀.₀₁, satisfying Pr{T_ν > t_α} = α, where T_ν is a random variable following a t-distribution on ν degrees of freedom. It was arguably the publication of these tables that led to a focus on the use of particular fixed levels for testing in applied work.

Example 4. Assume the model specifies that Y₁, …, Yₙ are independent, identically distributed from a distribution with density f(·) on ℝ and that we are interested in testing whether or not f(·) is a normal density:

\[
H_0 : f(y) = (\sqrt{2\pi})^{-1} e^{-\frac{1}{2} y^2}
\tag{9}
\]

or

\[
H_0 : f(y) = (\sqrt{2\pi})^{-1} \theta_2^{-1} \exp\left( -\frac{1}{2\theta_2^2} \{y - \theta_1\}^2 \right);
\tag{10}
\]

the former is a simple and the latter is a composite null hypothesis. For this problem it is less obvious how to construct a test statistic or how to choose among alternative test statistics. Under Eqn. (9) we know the distribution of each observation (standard normal) and thus of any function of the observations. The ordered values of Y, Y₍₁₎ ≤ ⋯ ≤ Y₍ₙ₎, could be compared to their expected values under Eqn. (9), for example by plotting one against the other, and deviation of this plot from a line with intercept 0 and slope 1 could be measured in various ways. In the case of the composite null hypothesis Eqn. (10), we could make use of the result that under H₀, (Yᵢ − Ȳ)/S has a distribution free of θ₁ and θ₂, and the vector of these residuals is independent of the pair (Ȳ, S²), and then for example compare the skewness n⁻¹ Σ {(Yᵢ − Ȳ)/S}³ with that expected under normality.

Example 5. Suppose we have a sample of independent observations Y₁, …, Yₙ on a circle of radius 1 and our null hypothesis is that the observations are uniformly distributed on the circle. One choice of a test statistic is T₁ = Σ_{i=1}^{n} cos Yᵢ, very large positive (or negative) values indicating a concentration of observations at angle 0 (or π). If, instead, we wish to detect clumps of observations at two angles differing by π, then T₂ = Σ_{i=1}^{n} {cos(2Yᵢ) − 1} would be more appropriate. The exact distribution of T₁ under H₀ is not available in closed form, but the mean and variance are readily computed as 0 and n, so a normal approximation might be used to compute the p-value.

In the examples described above, the test statistics are ad hoc choices likely to be large if the null hypothesis is not true; these are called pure tests of significance.
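For Example 3, the t-statistic and its p-value can be computed directly; the data below are invented purely for illustration:

```python
from math import sqrt
from statistics import mean, stdev
from scipy import stats

# Invented sample; test H0: theta1 = 0 against larger values (Example 3)
y = [0.2, 1.1, -0.3, 0.8, 1.5, 0.4, 0.9, -0.1]
n = len(y)
t_obs = sqrt(n) * (mean(y) - 0.0) / stdev(y)

# p-value Pr{T > t_obs} for T ~ t on n-1 degrees of freedom
p = stats.t.sf(t_obs, df=n - 1)
print(round(t_obs, 2), round(p, 3))
```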
Clearly, a more sensitive test can be constructed if we have more specific knowledge of the likely form of departures from the null hypothesis. The theory of hypothesis testing formalizes this by setting up a null hypothesis and an alternative hypothesis, and seeking to construct an optimal test for discriminating between them. See Hypothesis Testing in Statistics. In the remainder of this section we consider a less formal approach based on the likelihood function.
1.2 Significance Tests in Parametric Models

In parametric models, tests of significance are often constructed by using the likelihood function, and the p-value is computed by using an established approximation to the distribution of the test statistic. The likelihood function is proportional to the joint density of the data,

\[
L(\theta; y) = c(y) f(y; \theta).
\tag{11}
\]

We first suppose that we are testing the simple null hypothesis H₀: θ = θ₀ in the parametric model f(y; θ). Three test statistics often constructed from the likelihood function are the Wald or maximum likelihood statistic

\[
w_e = (\hat{\theta} - \theta_0)^{\mathrm{T}} j(\hat{\theta}) (\hat{\theta} - \theta_0),
\tag{12}
\]

the Rao or score statistic

\[
w_u = U(\theta_0)^{\mathrm{T}} \{ j(\hat{\theta}) \}^{-1} U(\theta_0),
\tag{13}
\]

and the likelihood ratio statistic

\[
w = 2\{ \ell(\hat{\theta}) - \ell(\theta_0) \}
\tag{14}
\]

where in Eqns. (12), (13), and (14) the following notation is used:

\[
\sup_{\theta} L(\theta; y) = L(\hat{\theta}; y)
\tag{15}
\]

\[
\ell(\theta) = \log L(\theta), \qquad U(\theta) = \ell'(\theta)
\tag{16}
\]

\[
j(\theta) = -\ell''(\theta).
\tag{17}
\]
The distribution of each of the statistics in Eqns. (12), (13), and (14) can be approximated by a χ²_k distribution, where k is the dimension of θ in the model. This relies on being able to apply a central limit theorem to U(θ), and to identify the maximum likelihood estimator θ̂ with the root of the equation U(θ) = 0. The precise regularity conditions needed are somewhat elaborate; see, for example, Lehmann and Casella (1998, Chap. 6) and references therein. The important point is that under the simple null hypothesis the
distribution of each of these test statistics is exactly known, and p-values readily computed. In the case that the hypothesis is composite, a similar triple of test statistics computed from the likelihood function is available, but the notation needed to define them is more elaborate. The details can be found in, for example, Cox and Hinkley (1974, Chap. 9, Sect. 3), and the notation above follows theirs. If θ is a one-dimensional parameter, then a one-sided version of the test statistics given in Eqns. (12), (13), and (14) can be used instead, since the square root of w_e, w_u, or w follows approximately a standard normal distribution. It is rare that the exact distribution of test statistics can be computed, but the normal or chi-squared approximation can often be improved, and this has been a subject of much research in the theory of statistics over the past several years. A good book-length reference is Barndorff-Nielsen and Cox (1994). One result of this research is that among the three test statistics the square root of the likelihood-ratio statistic w is generally preferred on a number of grounds, including the accuracy of the normal approximation to its exact distribution. This is true for both simple and composite tests of a scalar parameter.
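A small sketch of the three likelihood-based statistics for the binomial model of Example 2, testing θ₀ = 0.5 with the taste-testing counts; the three values are close to one another, and each is referred to a χ²₁ approximation (the score statistic is computed with j(θ̂), following Eqn. (13) above):

```python
from math import log
from scipy import stats

# Binomial example: y successes in n trials, testing H0: theta = 0.5
n, y, theta0 = 20, 12, 0.5
theta_hat = y / n

def loglik(theta):
    return y * log(theta) + (n - y) * log(1 - theta)

def score(theta):                 # U(theta) = l'(theta)
    return y / theta - (n - y) / (1 - theta)

def obs_info(theta):              # j(theta) = -l''(theta)
    return y / theta**2 + (n - y) / (1 - theta)**2

w_e = (theta_hat - theta0) ** 2 * obs_info(theta_hat)   # Wald, Eqn. (12)
w_u = score(theta0) ** 2 / obs_info(theta_hat)          # score, Eqn. (13)
w   = 2 * (loglik(theta_hat) - loglik(theta0))          # likelihood ratio, Eqn. (14)

# All three are near 0.8 here, with p-values around 0.37
for stat in (w_e, w_u, w):
    print(round(stat, 3), round(stats.chi2.sf(stat, df=1), 3))
```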
1.3 Significance Functions and Posterior Probabilities

We can also use a test of significance to consider the whole set or interval of values of θ that are consistent with the data. If θ is scalar, one of the simplest ways to do this is to compute r(θ) = ±√w(θ) as a function of θ, and tabulate or plot Φ{r(θ)} against θ, choosing the negative root for θ > θ̂ and the positive square root otherwise. This significance function will in regular models decrease from one to zero as θ ranges over an interval of values. The θ values for which Φ(r) is 0.975 and 0.025 provide the endpoints of an approximate 95 percent confidence interval for θ.

In a Bayesian approach to inference it is possible to make probability statements about the parameter or parameters in the model by constructing a posterior probability distribution for them. In a model with a scalar parameter θ, based on a prior π(θ) and model f(y; θ), we compute a posterior density for θ as

\[
\pi(\theta \mid y) \propto f(y; \theta)\, \pi(\theta)
\tag{18}
\]

and can assess any particular value θ₀ by computing

\[
\int_{\theta_0}^{\infty} \pi(\theta \mid y)\, d\theta
\]

called the posterior probability of θ being larger than θ₀. This posterior probability is different from a
p-value: a p-value assesses the data in light of a fixed value of θ, and the posterior probability assesses a fixed value of θ in light of the probability distribution ascribed to the parameter. Many people find a posterior probability easier to understand, and indeed often interpret the p-value in this way. There is a literature on choosing priors, called 'matching priors,' to reconcile these two approaches to inference; recent developments are perhaps best approached from Kass and Wasserman (1996). See also Bayesian Statistics.
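For a normal mean with known unit variance (a deliberately simple case), w(θ) = n(ȳ − θ)², so the significance function has the closed form Φ{√n(ȳ − θ)}; the sketch below, with invented summary statistics, checks the confidence-interval endpoints and illustrates the flat-prior matching of posterior probability and significance function mentioned above:

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf

# Invented summary statistics: sample size and sample mean, known variance 1
n, ybar = 10, 0.4

def significance(theta):
    # r(theta) = sign(ybar - theta) * sqrt(w(theta)) = sqrt(n) * (ybar - theta)
    return Phi(sqrt(n) * (ybar - theta))

# The function decreases from 1 to 0; where it equals 0.975 and 0.025 are the
# endpoints of an approximate 95 percent confidence interval.
lo, hi = ybar - 1.96 / sqrt(n), ybar + 1.96 / sqrt(n)
print(round(significance(lo), 3), round(significance(hi), 3))  # 0.975 0.025

# Under a flat prior, the posterior probability that theta exceeds theta0 = 0
# equals this same quantity, a simple instance of 'matching'.
print(round(significance(0.0), 3))
```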
1.4 Hypothesis Testing

Little has been said here about the choice of a test statistic for carrying out a test of significance. The difficulty is that the theory of significance testing provides no guidance on this choice. The likelihood based test statistics described above have proved to be reasonably effective in parametric models, but in a more complicated problem, such as testing the goodness of fit of an hypothesized model, this approach is often not available. To make further progress in the choice of test statistics, the classical approach is to formulate a notion of a 'powerful' test statistic, i.e., one that will reliably lead to small p-values when the null hypothesis is not correct. To do this in a systematic way requires specifying what model might hold if in fact the null hypothesis is incorrect. In parametric models where the null hypothesis is H₀: θ = θ₀, the alternative may well be Hₐ: θ ≠ θ₀. In more general settings the null hypothesis might be H₀: 'the model is normal' and the alternative Hₐ: 'the model is not normal.' Even in the parametric setting, if θ is a vector parameter it may be necessary to consider what direction in the parameter space away from θ₀ is of interest. The formalization of these ideas is the theory of hypothesis testing, which considers both null and alternative hypotheses, and optimal choices of test statistics. See Hypothesis Testing in Statistics; Goodness of Fit: Overview.
2. Further Topics

2.1 Combining Tests of Significance

The p-value is a function of the data, taking small values when the data are incompatible with the null hypothesis, and vice versa. As a function of y the p-value itself has a distribution under the model f(y; θ), and in particular under the null hypothesis H₀ has the uniform distribution on the interval (0, 1). In principle, then, if we have computed p-values from a number of different datasets, the p-values can be compared to observations from a U(0, 1) distribution with the objective of obtaining evidence of failure of the null hypothesis across the collection of datasets. This is one of the ideas behind meta analysis; see Meta-analysis: Overview and Meta-analysis: Tools. One difficulty is that the studies will nearly always differ in a number of respects that may mean they are not all measuring the same parameter, or measuring it in the same way. Another difficulty is that studies for which the p-value is not 'statistically significant' will not have been published, and thus are unavailable to be included in a meta analysis. This selection effect may seriously bias the results of the meta analysis.
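One classical way of carrying out such a comparison is Fisher's combination method: under H₀ each p-value is U(0, 1), so −2 log p is χ²₂, and the sum over k independent studies is χ²_{2k}. A sketch with invented p-values:

```python
from math import log
from scipy import stats

# p-values from k independent studies of the same null hypothesis (invented)
p_values = [0.11, 0.04, 0.22, 0.09]

# Under H0, -2*log(p) ~ chi-squared on 2 df; the sum over k studies is
# chi-squared on 2k df (Fisher's combination method).
chi2_stat = -2 * sum(log(p) for p in p_values)
combined_p = stats.chi2.sf(chi2_stat, df=2 * len(p_values))
print(round(chi2_stat, 2), round(combined_p, 4))
```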
2.2 Sample Size

The p-value is a decreasing function of the size of the sample, so that a very large study is more likely to show 'statistical significance' than a smaller study. This has led to considerable criticism of the p-value as a summary measure. Some statisticians have argued that, for this reason, posterior probabilities are a better measure of disagreement with the null hypothesis; see, for example, Berger and Sellke (1987). To some extent the criticism can be countered by noting that the p-value is just one summary measure of a set of data, and excessive reliance on one measure is inappropriate. In a parametric setting it is nearly always advisable to provide, along with the p-value for testing a particular value of the parameter of interest, an estimate of the observed effect and an indication of the precision of this estimate. This can be accomplished by reporting a significance function, if the parameter of interest is one-dimensional. At a more practical level, it should always be noted that a small p-value should be interpreted in the context of other aspects of the study. For example, a p-value of less than 0.05 could be obtained from a very small difference in a study of 10,000 cases or a relatively larger difference in a study of 1,000. While a 1 percent reduction may be of substantial importance in some scientific contexts, this needs to be evaluated in its context, and not by relying on the fact that it is 'statistically significant.' Unfortunately, the notion that a study report is complete if and only if the p-value is found to be less than 0.05 is fairly widely ingrained in some disciplines, and indeed forms a part of the requirements of some government agencies for approving new treatments.

Another point of confusion in the evaluation of p-values for testing scalar parameters is the distinction sometimes made between one-sided and two-sided tests of significance. A reliable procedure is to compute the p-value as twice the smaller of the probabilities that the test statistic is larger than or smaller than the observed value, under the null hypothesis. This so-called two-sided p-value measures disagreement with the null hypothesis in two directions away from the null hypothesis, towards the alternative that the, say, new treatment is worse than the old treatment as well as better. In the absence of very concrete a priori evidence that the alternative hypothesis is genuinely one-sided, this p-value is preferable.

In testing hypotheses about parameters of dimension larger than one, it can be difficult, as noted above, to define the relevant direction away from the null hypothesis. One solution is to identify the parameters of interest individually, and carry out separate tests on those parameters. This will usually be effective unless the dimension of the parameter of interest is unusually large, but in many applications it would be rare to have more than a dozen such parameters. Extremely large datasets, such as arise in so-called data mining applications, raise quite new problems of inference, however.
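The two-sided recipe just described is a one-line computation; for a t-statistic, for example (values invented):

```python
from scipy import stats

# Observed t-statistic and its degrees of freedom (invented values)
t_obs, df = 2.1, 15

# Twice the smaller of the two tail probabilities under the null hypothesis
p_two_sided = 2 * min(stats.t.sf(t_obs, df), stats.t.cdf(t_obs, df))
print(round(p_two_sided, 3))
```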
Table 2 Hypothetical data from a small fictitious experiment on insomnia

                  Decrease in          No decrease    No. of
                  insomnia reported    reported       participants
Herbal tea        x                    10 − x         10
Other beverage    13 − x               x − 3          10
Total             13                   7              20
In the absence of very concrete a priori evidence that the alternative hypothesis is genuinely one-sided, this p-value is preferable. In testing hypotheses about parameters of dimension larger than one, it can be difficult, as noted above, to define the relevant direction away from the null hypothesis. One solution is to identify the parameters of interest individually, and carry out separate tests on those parameters. This will usually be effective unless the dimension of the parameter of interest is unusually large, but in many applications it would be rare to have more than a dozen such parameters. Extremely large datasets such as arise in so-called data mining applications raise quite new problems of inference, however.
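To illustrate both the two-sided convention and the earlier remarks on sample size, the following sketch (the counts are hypothetical and the helper function is an assumption of this example, not taken from the text) computes the two-sided p-value as twice the smaller tail probability for the usual normal-approximation comparison of two proportions.

```python
# Illustrative sketch: two-sided p-value computed as twice the smaller
# tail probability, for the normal-approximation test of H0: p1 = p2
# with two independent binomial samples (hypothetical counts).
import numpy as np
from scipy.stats import norm

def two_sided_p(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se                       # test statistic under H0
    return 2 * min(norm.sf(z), norm.cdf(z))  # twice the smaller tail

print(two_sided_p(1000, 5000, 920, 5000))  # 1.6-point difference, 10,000 cases: about 0.04
print(two_sided_p(100, 500, 75, 500))      # 5-point difference, 1,000 cases: about 0.04
```

A difference of 1.6 percentage points in the large study and a difference of 5 percentage points in the small one yield essentially the same p-value, which is why the p-value should always be read alongside the estimated effect and the sample size.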
Table 3 The set of achievable p-values from Table 2, as a function of x. The p-value for Table 2 is the sum of p(s) over s = x, …, 10, where p(s) = C(10, s) C(10, 13 − s) / C(20, 13). The mid p-value is (1/2)p(x) plus the sum of p(s) over s = x + 1, …, 10
x     p(x)      p-value    Mid p-value
10    0.0015    0.0015     0.0007
 9    0.0271    0.0286     0.0150
 8    0.1463    0.1749     0.1018
 7    0.3251    0.5000     0.3375
2.3 Fixed Level Testing
The problem of focusing on one or two so-called critical p-values is sometimes referred to as fixed-level testing. This was useful when computation of p-values was a very lengthy exercise, and it was usual to provide tables of critical values. It is now usually a routine matter to compute the exact p-value, which is usually (and should be) reported along with other details such as sample size, estimated effect size, and details of the study design. There is still in some quarters a reliance on fixed-level testing, with the result that studies for which the p-value is judged ‘not statistically significant’ may not be published. This is sometimes called the ‘file drawer problem,’ and a quantitative analysis was considered in Dawid and Dickey (1977). More recently there has been a move to make the results of inconclusive studies available over the Internet; if this becomes widespread practice, it will alleviate the file drawer problem. This issue is particularly important for meta-analysis.
2.4 Achievable p-Values
In some problems where the distribution is concentrated on a discrete set, the number of available p-values will be relatively small. For example, Table 2 shows a hypothetical 2 × 2 table for the herbal tea experiment. The null hypothesis is that the two beverages are equally effective at reducing insomnia, and is here tested using Fisher’s exact test.
The relevant calculations are presented in Table 3, from which we see that the set of achievable p-values is relatively small. Some authors have argued that for such highly discrete situations a better assessment of the null hypothesis can be achieved by the use of Barnard’s mid p-value, which is (1/2) Pr(X = x) + Pr(X > x). See Agresti (1992) and references therein.
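The entries of Table 3 can be reproduced in a few lines; the sketch below (the SciPy parameterization and variable names are implementation choices of this example, not from the article) evaluates p(x), the exact p-value Pr(X ≥ x), and the mid p-value for each achievable x.

```python
# Illustrative sketch reproducing Table 3. With the margins of Table 2
# fixed, the herbal-tea 'decrease' count X is hypergeometric:
# Pr(X = s) = C(10, s) C(10, 13 - s) / C(20, 13), for s = 3, ..., 10.
from scipy.stats import hypergeom

# scipy convention: hypergeom(M, n, N).pmf(k) = C(n,k) C(M-n,N-k) / C(M,N)
rv = hypergeom(20, 10, 13)   # 20 participants, 10 herbal-tea drinkers, 13 decreases

for x in (10, 9, 8, 7):
    p_x = rv.pmf(x)
    p_value = rv.sf(x - 1)           # Pr(X >= x)
    mid_p = 0.5 * p_x + rv.sf(x)     # (1/2) Pr(X = x) + Pr(X > x)
    print(f"x={x:2d}  p(x)={p_x:.4f}  p-value={p_value:.4f}  mid p-value={mid_p:.4f}")
```

The printed values agree with Table 3 up to rounding in the final digit.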
3. Conclusion
A test of statistical significance is a mathematical calculation based on a test statistic, a null hypothesis, and the distribution of the test statistic under the null hypothesis. The result of the test is to indicate whether the data are consistent with the null hypothesis: if they are not, then either we have observed an event of low probability, or the null hypothesis is not correct. The choice of test statistic is in principle arbitrary, but in practice might be determined by convention in the field of application, by intuition in a relatively new setting, or by one or more considerations developed in statistical theory. It is convenient to use test statistics whose distributions can easily be calculated exactly or to a good approximation. It is useful to use a test statistic that is sensitive to the particular departures from the null hypothesis that are of particular interest in the application. A test of statistical significance is just one component of the analysis of a set of data, and should be supplemented by estimates of effects of interest, considerations related to sample size, and a discussion of the validity of any assumptions of independence or underlying models that have been made in the analysis. A statistically significant result is not necessarily an important result in any particular analysis, but needs to be considered in the context of research in that field. An eloquent introduction to tests of significance is given in Fisher (1935, Chap. 2).
Kalbfleisch (1979, Chap. 12) is a good textbook reference at an undergraduate level. The discussion here draws considerably from Cox and Hinkley (1974, Chap. 3), which is a good reference at a more advanced level. An excellent overview is given in Cox (1977). For a criticism of p-values see Schervish (1996) as well as Matthews (1998).
See also: Distributions, Statistical: Approximations; Frequentist Inference; Goodness of Fit: Overview; Hypothesis Testing in Statistics; Likelihood in Statistics; Meta-analysis: Overview; Resampling Methods of Estimation
Bibliography
Agresti A 1992 A survey of exact inference for contingency tables. Statistical Science 7: 131–53
Barndorff-Nielsen O E, Cox D R 1994 Inference and Asymptotics. Chapman and Hall, London
Berger J O, Sellke T 1987 Testing a point null hypothesis: The irreconcilability of p-values and evidence. Journal of the American Statistical Association 82: 112–22
Cox D R 1977 The role of significance tests. Scandinavian Journal of Statistics 4: 49–70
Cox D R, Hinkley D V 1974 Theoretical Statistics. Chapman and Hall, London
Dawid A P, Dickey J M 1977 Properties of diagnostic data distributions. Journal of the American Statistical Association 72: 845–50
Fisher R A 1935 The Design of Experiments. Oliver and Boyd, Edinburgh, UK
Kalbfleisch J G 1979 Probability and Statistical Inference. Springer, New York, Vol. 2
Kass R E, Wasserman L 1996 Formal rules for selecting prior distributions: A review and annotated bibliography. Journal of the American Statistical Association 91: 1343–70
Lehmann E L, Casella G 1998 Theory of Point Estimation. Springer, New York
Matthews R 1998 The great health hoax. Sunday Telegraph 13 September. Reprinted at ourworld.compuserve.com/homepages/rajm/
Schervish M J 1996 P values: what they are and what they are not. American Statistician 50: 203–6
N. Reid
Simmel, Georg (1858–1918)
Georg Simmel was born in the heart of Berlin on March 1, 1858. He was the youngest son of Flora (born Bodstein) and Eduard Simmel, who, although coming from Jewish families, had been baptized into Christianity (he as a Catholic, she as a Protestant). Following the early death of Simmel’s father in 1874, the family suffered serious financial difficulties, which,
where the young Georg was concerned, were overcome thanks to Julius Friedländer, a friend of the family (co-founder of the music publishing company ‘Peters’). Friedländer felt a strong sympathy for the young Simmel, and indeed took him under his wing as his protégé. Thus, Georg could attend high school and then university in Berlin, where he studied philosophy, history, art history, and social psychology (Völkerpsychologie). Simmel received his degree as Doctor of Philosophy in 1881, but not without difficulty: his first attempt at a doctoral thesis, ‘Psychological and Ethnological Studies on the Origins of Music,’ was not accepted, and he had instead to submit his previous work on Kant, On the Essence of Matter—Das Wesen der Materie nach Kant’s Physischer Monadologie, which had earned him a prize (Köhnke 1996). The Habilitation (postdoctoral qualification to lecture) came next in 1885, for which he also encountered some controversy, and after this his academic career began immediately as a Privatdozent (external lecturer). An extraordinary (außerordentliche) professorship without salary followed in 1901, and indeed he had to wait until 1914 before he was offered a regular professorship at the University of Strasbourg, where he remained until his death on September 26, 1918, shortly before the end of the First World War. During his lifetime, Simmel was a well-known figure in Berlin’s cultural world. He did not restrict himself merely to scientific or academic matters, but consistently showed great interest in the politics of his time, including contemporary social problems and the world of the arts. He sought to be in the presence of, and in contact with, the intellectuals and artists of his day. He married the painter Gertrud Kinel, with whom he had a son, Hans, and maintained friendships with Rainer Maria Rilke, Stefan George, and Auguste Rodin, amongst many others. At his home in Berlin, he organized private meetings and seminars (Simmel’s privatissimo), whose participants he would choose personally.
1. Georg Simmel and the Social Sciences
Simmel’s contributions to the social sciences are immeasurable. Nevertheless, most of them remain misunderstood, or have been separated from the intentions of their creator, and, thus, their origins have been forgotten. From system theory to symbolic interactionism, almost all sociological theories need to rediscover Simmel as one of their main founding parents. Simmel’s interest in the social sciences, especially in sociology, can be traced to the very beginning of his academic career. After attending seminars offered by Moritz Lazarus and Heymann Steinthal (founders of the Völkerpsychologie) during his student years, he became a member of Gustav Schmoller’s circle, where he
became acquainted with the debates concerning the national economy of the time, and for whom he gave a lecture on The Psychology of Money (GSG2 1989, pp. 49–65), which would later constitute the first pillar of one of his major works, The Philosophy of Money (Frisby and Köhnke 1989). In both of these circles Simmel became more aware of, and sensitized towards, social questions. Schmoller’s engagement with social questions, together with Lazarus’ and Steinthal’s emphasis on the level of ‘Überindividualität’ (supraindividuality), and their relativistic worldview and insistence that ethical principles are not of universal validity, as well as Simmel’s own interest in Spencer’s social theory, all helped to shape the contours of his sociological approach. These various influences were melded together with Simmel’s philosophical orientation, particularly his interest in Kant, which yielded a new and rather far-reaching sociological perspective that would evolve extensively throughout his life. For example, Simmel’s first sociological work, On Social Differentiation (Über sociale Differenzierung: GSG2 1989), written at the very beginning of his academic career, was deeply influenced by Herbert Spencer and the ideas of Gustav Schmoller and his circle. Slowly, however, as the 1890s drew on, the admiration Simmel had felt for Spencer’s theories turned into rejection, and he distanced himself from an evolutionary-organicist approach to sociology. In this rejection of such theories, he brought his knowledge of Kant to his sociological thinking, and this would become the basis of his later contact and dialogue with the members of the southern-German neo-Kantian school, such as Rickert, Windelband, and with the sociologist who was closest to their ideas: Max Weber. This contact with neo-Kantianism influenced Simmel’s approach to the social sciences at the end of the nineteenth century and during the first years of the 1900s. From contemporary testimonies we know that Simmel was in fact one of the greatest lecturers at the University of Berlin, and that his seminars were attended by a great number of students (Gassen and Landmann 1993). It is thus difficult to understand why Simmel did not have a more successful academic career. We know from his correspondence and from Gassen and Landmann’s attempts to reconstruct Simmel’s life and oeuvre that Georg Jellinek engaged himself, though with no positive results, in seeking to obtain an ordinary professorship for Simmel at the University of Heidelberg in 1908 (with, following Jellinek’s death, a further attempt by Alfred Weber in 1912, and another in 1915). His converted Jewish background (i.e., assimilated into German society) surely played a significant role in the lack of recognition Simmel received from the academic system; but also his peculiar and original understanding of science, which diverged greatly from established patterns, as well as his characteristic mode of writing,
using essays instead of more ‘academic’ and standardized forms, contributed to his not being accepted into the rather formal and classical German academic milieu. Both attempts at obtaining a professorship for Simmel were blocked by the bureaucracy of the Grand Duchy of Baden. In fact the letter of evaluation written by the Berlin historian Schäfer regarding a possible professorship for Simmel in Heidelberg in 1908 gives as its primary argument against Simmel’s potential ability to be a good professor his ‘Jewishness,’ which, according to Schäfer, too obviously tinged Simmel’s character and intellectual efforts with a strong relativism and negativity, which could not be good for any student (Gassen and Landmann 1993, Köhnke 1996). Throughout his life Simmel had to fight against these kinds of accusation, and he endeavored to build a ‘positive relativism’ in an attempt to show that he did not question ‘absolute pillars’ and thus leave us with nothing, but sought instead to show that this sense of ‘absoluteness’ was also a product of human reciprocal actions and effects (Wechselwirkungen), and was not fundamentally absolute. Such an argument was too much for the society and scientific milieu of the time to accept and forgive, even when Simmel later radically rejected his Introduction to the Moral Sciences (Einleitung in die Moralwissenschaften: GSG3 1989, GSG4 1991), referring to it as a sin of his youth. This was the work in which his relativism, in the critical, ‘negative’ sense, had been most fully in bloom (Köhnke 1996). Notwithstanding these varying approaches to sociology, Simmel’s interest in the discipline persisted right through his career. When Simmel, in his letter to Célestin Bouglé of March 2, 1908, wrote: ‘At the moment I am occupied with printing my Sociology, which has finally come to its end,’ and, sentences later, added that work on this book ‘had dragged on for fifteen years’ (Simmel Archive at the University of Bielefeld), he indicated that his engagement with sociology was a long-term project. Considering that Sociology (Soziologie) was first published in 1908 (and therefore his work on it must have begun around 1893), we can find its first seed in his article The Problem of Sociology (Das Problem der Sociologie), originally published in 1894. Simmel must have thought this article a significant contribution to sociology, since he endeavored to spread it abroad as much as possible. Hence, the French translation of The Problem of Sociology appeared, simultaneously with the original German version, in September 1894. The American translation appeared in the Annals of the American Academy of Political and Social Science a year later, and, by the end of the century, the Italian and Russian translations were also in print. The American translation is of particular significance, since Simmel emphasized therein, in a footnote, that sociology involved an empirical basis and research, and should not be thought of as an independent offshoot from philosophy, but as a science concerning
the social problems of the nineteenth century. In the Italian translation he delivered an updated version of the text, wherein he introduced clear references to his theoretical polemic with Emile Durkheim. From his letter to Célestin Bouglé of February 15, 1894, we know that Simmel was, after the completion of The Problem of Sociology, quite excited by this new discipline, and that he did not foresee a shift away to any other fields of inquiry in his immediate future. During those years he worked on most of the key sociological areas of research, thus articulating the key social problems of his time within a sociological framework: for example, workers’ and women’s movements, religion, the family, prostitution, medicine, and ethics, amongst many others. He seemed deeply interested in putting his new theoretical proposals and framework for the constitution of sociology into practice. He realised the institutionalisation of the new discipline would be reinforced by the establishment of journals for the discipline; hence his participation in the ‘Institut International de Sociologie,’ for which he became vice-president; the American Journal of Sociology; and, although only briefly (due to differences with Emile Durkheim), l’Année sociologique (Rammstedt 1992b, p. 4). He also played with the idea of creating his own sociological magazine. Another means of solidifying the role of sociology within the scientific sphere was through academia, and he engaged himself in organising sociological seminars, offering them uninterruptedly from 1893 until his death in 1918. On June 15, 1898, he wrote to Jellinek: ‘I am absolutely convinced that the problem, which I have presented in the Sociology, opens a new and important field of knowledge, and the teaching of the forms of sociation as such, in abstraction from their contents, truly represents a promising synthesis, a fruitful and immense task and understanding’ (Simmel Archive at the University of Bielefeld). Despite his original intention, it is clear Simmel did not work continuously on the Sociology for 15 years as, from 1897 to 1900, he worked almost exclusively on The Philosophy of Money (Philosophie des Geldes), and also found time for the writing and publication of his Kant (1904), as well as a reprint of his Introduction to the Moral Sciences (1904), which he had intended to rewrite, for he no longer accepted most of the ideas he had presented therein when it was first published in 1892 (although he did not achieve this); the revised edition of The Problems of the Philosophy of History (Probleme der Geschichtsphilosophie, 1905), The Philosophy of Fashion (Philosophie der Mode, 1905), and Religion (Die Religion, 1906/1912) were all worked on by Simmel during this period too (Rammstedt 1992b). In The Problem of Sociology he questioned for the first time the lack of a theoretically well-defined object of study for the emerging discipline, and sought to develop a specific sociological approach, which would entail a distinct object of study, in order to bestow legitimacy and
scientific concreteness on a discipline under attack from different, and more settled, lines of fire. Fearing his call had not been heard, Simmel endeavored to prove his point by writing a broader work, within which he attempted to put into practice the main guidelines he had suggested as being central to the newly emerging discipline. In this way the Sociology, almost one thousand pages long, was cobbled together from various bits and pieces, taken from several essays he had written between the publication of The Problem of Sociology in 1894 and its final completion in 1908. Simmel, as can be understood from his letters, was aware of the incompleteness of this work, but rescued it by saying that it was an attempt to realise that which he had suggested almost 15 years earlier, for it had not been noticed enough by the scientific community. Simmel did not wish, by that time, to be thought of as only a sociologist. This was due to sociology not being an established discipline within the academic world, which therefore did not allow him to obtain a professorship in the field (i.e., it offered very little recognition). Nevertheless he never quite abandoned the field of sociology and continued to write about religion, women’s issues, and the family. He merely broadened his scope to include Lebensphilosophie (the philosophy of life) and cultural studies in general, whilst participating at the same time in the founding of the German Sociological Society (Deutsche Gesellschaft für Soziologie), for which he served as one of the chairmen until 1913. Sociology marks the end of Simmel’s strongly Kantian period, and represents a turning point in his interests, for, from this moment onwards, he refused to be labeled as a sociologist, seeking instead to return to philosophy. Indeed, following the publication of the ‘big’ Sociology, as it is usually called by those who work in Simmel studies, he did not publish any sociological papers for nine years (although he continued, as has already been mentioned, to offer sociological seminars until his death), instead devoting his efforts to philosophy, history, and the philosophy of art. Hence, as part of Simmel’s output from these years we find, amongst other works, Kant and Goethe (Kant und Goethe, 1906/1916), Goethe (1913), Kant (1904/1913/1918), The Principal Problems of Philosophy (Hauptprobleme der Philosophie, 1911), The Philosophical Culture (Philosophische Kultur, 1911/1918), and Rembrandt (1916). This new direction was accompanied and partly motivated by Simmel’s acknowledgment of Henri Bergson’s oeuvre, and his inclination towards Lebensphilosophie (the philosophy of life; see Fitzi 1999, GSG16 1999). Thus, ‘life’ became the primary focus of Simmel’s theoretical work, and consequently ‘society’ was pushed into a secondary role. During these years Simmel occupied himself with the study of artists, of their production, and how the relation between life and its formal expression is crystallised in their work. It was as if Simmel had lost interest in sociology, as he did not write a single line
concerning it for years. Yet, unexpectedly in 1917, a year before his death, he wrote Grundfragen der Soziologie (Main Questions of Sociology), the ‘small sociology,’ as it is called by Simmel scholars (i.e., in contrast to his 1908 work). The impetus for writing this book came from a publisher (Sammlung Göschen), who intended to print an introductory work on sociology and invited Simmel to write it because of the success other works of his had enjoyed with the same publisher. If Simmel had indeed distanced himself from all sociological questions, it is likely he would have merely reached for the shelf of his previous works and rewritten them in shorter form. But this was not the case, because, although he utilised older material, Simmel rewrote and redefined his perspective, and in the Main Questions of Sociology, scarcely 100 pages long, presented the final stage of his sociological reflections, which melded together his previous study of forms of sociation with a perspective from the philosophy of life. This approach to sociology fell into neglect after his death, awaiting revitalisation, harbouring a broad scope and original perspective for new generations of sociologists to rediscover (Simmel’s last work was a philosophical contribution to Lebensphilosophie, the Lebensanschauung—View of Life—1918, GSG16 1999).
2. The Object of Sociology
At a time when sociology was still far from being an established discipline, instead seeking to stand up and open its eyes for the first time, Simmel sought to release it from the burden of being ‘the science of society.’ According to him this burden was an impossible one for the new-born discipline to carry, since being the science of society meant having to compete with already settled and established disciplines for the legitimacy of its object of study; law and history, psychology and ethnology, all could argue the case that society was their object, thus leaving sociology with the mere pretence of including elements from them all. Viewed from this perspective, as an object society was an all-encompassing matter but, at the same time, it eluded any scientific investigation, just like sand falling through our fingers. Simmel, as Max Weber two decades later, contributed to the demystification of ‘society’ as some kind of essential entity (as it appeared in nineteenth-century sociology, and remained as such in the sociology of Durkheim and Tönnies), instead explaining it as a dynamic process, a continuous happening, a continuous becoming, which is nothing more than the mere sum of the existing forms of sociation. Individuals as well as society/ies are not units in and of themselves, though they may appear as self-sufficient units depending on the distance the observer interposes between him/herself and them (as observed objects).
However, Simmel argued in his Main Questions of Sociology (GSG16 1999, pp. 62–8) that if we merely take individuals and pretend, through adopting a significant distance, to approach ‘society,’ or a social phenomenon, we will not reach our goal. Therefore, when conducting a sociological inquiry, what we must seek to do is base ourselves on the forms of sociation, which are built in Wechselwirkung. Simmel used this concept when either addressing ‘interactions’ or following the Kantian definition as ‘reciprocal actions and effects’ (Simmel 1978). The difference between the two possible meanings of this concept, or, more precisely, the differentiation of the two different concepts hidden behind the same word, has been addressed in the English translation of The Philosophy of Money; so the usual mistake of translating Wechselwirkung as a direct synonym for interaction has been corrected. According to Simmel, what actually takes place between individuals will be seen to be that which constitutes the object of sociology, not merely the individuals by themselves, or society as a whole, for, as previously mentioned, society is nothing but the sum of forms of sociation, a continuous process in and for which these forms intertwine and combine themselves to form the whole (GSG11 1992, p. 19). Thus, Simmel defined society as the sum of the forms of reciprocal actions and effects.
3. The Concept of ‘Form’
If we take reciprocal actions and effects as our starting point, we achieve only a perspective common to all social and human sciences: sociology as method. In order to construct sociology as an independent discipline, with a specific object of study, it is necessary to analytically differentiate between form and content, that is, ‘forms’ as means and patterns of interaction between individuals, social groups or institutions, and ‘contents’ as that which leads us to act, the emotions or goals of human beings. Thus, social forms were conceived as being the object of analysis for a scientific sociology, which would only be possible empirically. Contents, on the other hand, should be left to other disciplines, such as psychology or history, to analyze. Simmel’s sociological theory is orientated towards these ‘forms of sociation’ (defined as such in the subtitle of Sociology). For individuals to become social, they need to rely on such forms in order to channel their contents, as forms represent their only means of participation in social interaction. Forms are independent from contents, and when analyzing them, it is necessary that they are abstracted from individual, particular participation in concrete interactions: the question of social forms does not include that regarding the specific relationships between the participants involved in concrete interactions; we are dealing only with that which is between them, as for an
analysis of social forms the particular individuals involved in them are irrelevant. Human beings, with their particular ‘contents,’ only become social when they seek to realise these contents, and then acknowledge this is only possible in a social framework: via ‘exteriorising,’ via acting through forms. Hence forms are social objectivations, which impose themselves, with their norms, upon particular individuals, an imposition which could only be annulled by isolation. These impositions and constraints are part of what sociation is, a concept which plays a key role in Simmel’s theory, for he actually maintained that human beings are not social beings by nature. Simmel placed sociation, as a product of reciprocal actions and effects, at the centre of his formal, or pure (reine), sociology. The form of competition can be used as an example of what Simmel actually meant by ‘forms.’ Competition does not imply any specific contents, and remains the same, independent of those who are competing, or what they are competing for. Competition forms ‘contents’ (such as job seeking, or looking for the attention of a beloved person, amongst many others), limits the boundaries of actions, and actually brings them into being via giving them a framework and shape in which they can appear in the arena of actions and effects. This situation of having the choice between multiple forms for channelling one content is stressed appreciatively in Simmel’s sociology, particularly in his Lebensphilosophie, when he no longer contrasted the concept of form with the concept of content, but with life. In this later period, forms are the crystallization of the unretainable flow of life, yet also the channels of expressing life, for life cannot be expressed as only itself. Simmel explained this apparent paradox by asserting that life is ‘more-life’ and also ‘more-than-life.’ The concept of ‘more-life’ merely implies that this continuous flow connects every complete moment with the next; ‘more-than-life’ implies that life is not life if it does not transcend its boundaries, and becomes crystallized in a form. So life becomes art, or science, for example; life becomes externalised and crystallised and hence expressed and fulfilled. For instance, we should understand art as one form of expressing (äußern) life through (artistic, aesthetic) forms. Actually uncovering how this perspective can be applied to sociology may appear to be a difficult task; the first step towards clarifying this was made by Simmel himself in his Main Questions of Sociology.
4. ‘Me’ and ‘You’
Deliberately distancing himself from the concept of ‘alter ego,’ Simmel emphasized the significance the concept of ‘you’ should have for all sociological theory. He articulated and embedded this concept within his—sociological—theory of knowledge. In parallel with Kant’s question on nature, Simmel formulated this question as ‘How is society possible?’, the first and
well-known digression of his Sociology. In order to answer this question he proposed three a priori. According to the first, ‘Me’ and ‘You’ would see each other as ‘to some extent generalised’ (GSG11 1992, p. 47), and he therefore assumed that each individual’s perception of the other (i.e., ‘you’) would also be generalized to a certain degree. The second a priori affirms that each individual as an ‘element of a group is not only a part of the society, but, on top of this, also something else’ (GSG11 1992, p. 51), and that the individual derives its own uniqueness, its individuality, from the duality of being/not being sociated. The third a priori is related to the assignment of each individual to a position within their society (in the sense of the sum of highly differentiated reciprocal actions and effects): ‘that each individual is assigned, according to his qualities, to a particular position within his social milieu: that this position, which is ideally suited to him, actually exists within the social whole—this is the supposition according to which every individual lives his social life’ (GSG11 1992, p. 59); this is what Simmel meant when he wrote about the ‘general value of individuality.’ These sociological a priori, orientated towards role, type, and individual/society, also allow us to understand the central aspects of Simmel’s methodology, which can be framed together with the concepts of ‘differentiation,’ in the sense of division and difference, and of ‘dualism,’ in the sense of an irreconcilable, tragic opposition of the elements of a social whole.
5. A Proposal for Three Sociologies
According to Simmel sociology needed to stand as an autonomous science in close relationship with ‘the principal problem areas’ (GSG16 1999, pp. 76–84) of social life, even when these areas were distant from each other. Already in 1895 he asserted that just which name to give to any particular group was quite unimportant, since the real question was to state problems and to solve them, and not at all to discuss the names which we should give to any particular groups (1895, Annals of the American Academy of Political and Social Science 6: 420). He stressed this orientation towards problem areas again in 1917, when he returned to theorizing on the Main Questions of Sociology. Thus, as an attempt to approach these main questions (i.e., which relationships exist between society and its elements, individuals) he concentrated on three different sets of problems with concrete examples, particularly in the third part of the first chapter. The first focuses on objectivity as a component of the social sphere of experience; the second includes the actual facts of life, in which, and through which, social groups are realised; and the third emphasises the significance of society (Fitzi 1999, GSG16 1999, p. 84) for grasping and understanding attitudes towards the world and life. These three problem areas correspond to his three proposed sociologies: the ‘general sociology,’ which, wherever society exists, deals with the
central relationships between the individuals and the social constructions resulting from them. The aim of general sociology is to show which reciprocally orientated values these social constructions and individuals possess. He gave an example for general sociology in the second chapter of the Main Questions of Sociology, by illustrating the relationships which exist between the ‘social and the individual level,’ a perspective which, at the same time, sociologizes the theory of the masses. The ‘pure or formal’ sociology focuses upon the multiple forms individuals use to embody contents, that is, emotions, impulses, and goals, in reciprocal actions and effects with others, be it with other people, social groups, or social organizations, and thus they constitute society. According to him, each form, entwined in these social processes, wins ‘a life of its own, a performance, which is free from all roots in contents’ (GSG16 1999, p. 106). The example Simmel chose for his formal sociology was the form of ‘sociability’ (Geselligkeit), which he presented in the third chapter of the Main Questions of Sociology. Finally, the ‘philosophical sociology’ circles the boundary of the ‘exact, orientated towards the immediate understanding of the factual’ (GSG16 1999, p. 84) empirical sociology: on the one hand the theory of knowledge (Erkenntnistheorie), on the other, the attempts to complement, through hypothesis and speculation, the unavoidably fragmentary character of factual, empirical phenomena (Empirie), in order to build a complete whole (GSG16 1999, p. 85). In the fourth chapter, entitled ‘The Individual and Society in Eighteenth and Nineteenth Century Views of Life,’ Simmel illustrated the abstract necessity of individual freedom, which should be understood as a reaction to the contemporary, increasing, social constraints and obligations. Thus, reference is made to general conflicts between individual and society, which derive from the irreconcilable conflict between the idea of society as a whole, which ‘requires from its elements the one-sidedness of a partial function,’ and the individual, who knows him/herself to be only partially socialized, and ‘him/herself who wants to be whole’ (GSG16 1999, p. 123).
6. Simmel and Modern Sociology
Simmel’s theoretical significance for contemporary sociology resides in the various theories that built on his sociology. As examples, it will be sufficient to name symbolic interactionism, conflict theory, functionalism, the sociology of small groups, and theories of modernity. Simmel also introduced into sociology the essay as an academic form of analysis, whilst his digressions within Sociology should be mentioned as well, for they have become generally recognized as classical texts; see, for example, his digressions on ‘The Letter’ (Brief), ‘Faithfulness’ (Treue), ‘Gratefulness’ (Dankbarkeit), and ‘The Stranger’ (Fremde). But above all sociology owes Simmel the freedom he gave
it from the fixation on the ‘individual and society’ as an ontic object—in hindsight, a point of no return.
See also: Capitalism; Cities: Capital, Global, and World; Cities: Internal Structure; Differentiation: Social; Ethics and Values; Family, Anthropology of; Family as Institution; Fashion, Sociology of; Feminist Movements; Gender and Feminist Studies in Political Science; Groups, Sociology of; History and the Social Sciences; Individual/Society: History of the Concept; Individualism versus Collectivism: Philosophical Aspects; Interactionism: Symbolic; Kantian Ethics and Politics; Knowledge (Explicit and Implicit): Philosophical Aspects; Knowledge Representation; Knowledge, Sociology of; Labor Movements and Gender; Labor Movements, History of; Medicine, History of; Methodological Individualism in Sociology; Methodological Individualism: Philosophical Aspects; Modernity; Modernity: History of the Concept; Modernization and Modernity in History; Modernization, Sociological Theories of; Money, Sociology of; Motivational Development, Systems Theory of; Personality and Social Behavior; Personality Structure; Personality Theories; Prostitution; Religion, Sociology of; Science and Religion; Self-knowledge: Philosophical Aspects; Social Movements and Gender; Social Movements, History of: General; Sociology, Epistemology of; Sociology, History of; Sociology: Overview; Symbolic Interaction: Methodology; Theory: Sociological; Urban Life and Health; Urban Sociology
Bibliography
Dahme H J, Rammstedt O (eds.) 1983 Georg Simmel. Schriften zur Soziologie. Eine Auswahl. Suhrkamp, Frankfurt am Main, Germany
Fitzi G 1999 Henri Bergson und Georg Simmel: Ein Dialog zwischen Leben und Krieg. Die persönliche Beziehung und der wissenschaftliche Austausch zweier Intellektuellen im deutsch-französischen Kontext vor dem Ersten Weltkrieg. Doctoral thesis, University of Bielefeld
Gassen K, Landmann M 1993 Buch des Dankes an Georg Simmel. Briefe, Erinnerungen, Bibliographie. Zu seinem 100. Geburtstag am 1. März 1958. Duncker & Humblot, Berlin
GSG 10: Philosophie der Mode (1905). Die Religion (1906/1912). Kant und Goethe (1906/1916). Schopenhauer und Nietzsche (1907), ed. M Behr, V Krech, and G Schmidt, 1995
GSG 8: Aufsätze und Abhandlungen 1901–1908, Band II, ed. A Cavalli and V Krech, 1993
GSG 2: Aufsätze 1887–1890. Über sociale Differenzierung (1890). Die Probleme der Geschichtsphilosophie (1892), ed. H J Dahme, 1989
GSG 5: Aufsätze und Abhandlungen 1894–1900, ed. H J Dahme and D Frisby, 1992
GSG 16: Der Krieg und die geistigen Entscheidungen (1917). Grundfragen der Soziologie (1917). Vom Wesen des historischen Verstehens (1918). Der Konflikt der modernen Kultur (1918). Lebensanschauung (1918), ed. G Fitzi and O Rammstedt, 1999
GSG 6: Philosophie des Geldes (1900/1907), ed. D Frisby and K C Köhnke, 1989
GSG 3: Einleitung in die Moralwissenschaft. Eine Kritik der ethischen Grundbegriffe. Erster Band (1892/1904), ed. K C Köhnke, 1989
GSG 4: Einleitung in die Moralwissenschaft. Eine Kritik der ethischen Grundbegriffe. Zweiter Band (1893), ed. K C Köhnke, 1991
GSG 1: Das Wesen der Materie (1881). Abhandlungen 1882–1884. Rezensionen 1883–1901, ed. K C Köhnke, 1999
GSG 7: Aufsätze und Abhandlungen 1901–1908, Band I, ed. R Kramme, A Rammstedt, and O Rammstedt, 1995
GSG 14: Hauptprobleme der Philosophie (1910/1927). Philosophische Kultur (1911/1918), ed. R Kramme and O Rammstedt, 1996
GSG 9: Kant (1904/1913/1918). Die Probleme der Geschichtsphilosophie, 2. Fassung (1905/1907), ed. G Oakes and K Röttgers, 1997
GSG 11: Soziologie. Untersuchungen über die Formen der Vergesellschaftung (1908), ed. O Rammstedt, 1992
Köhnke K C 1996 Der junge Simmel in Theoriebeziehungen und sozialen Bewegungen. Suhrkamp, Frankfurt am Main, Germany
Rammstedt O (ed.) 1992a Georg Simmel Gesamtausgabe. Suhrkamp, Frankfurt, Germany
Rammstedt O 1992b Programm und Voraussetzungen der Soziologie Simmels. Simmel-Newsletter 2: 3–21
Simmel G 1978 The Philosophy of Money, trans. and ed. D Frisby and T Bottomore. Routledge, London
Simmel G 1895 The Problem of Sociology. Annals of the American Academy of Political and Social Science 6
O. Rammstedt and N. Cantó-Milà
Simulation and Training in Work Settings
Training is a set of activities planned to bring about learning in order to achieve goals often established using a training needs analysis. Methods of training include: behavior modeling, based on video or real activities; action training, which provides an opportunity to learn from errors; rule- and exemplar-based approaches that differ in terms of the extent to which theory is covered; and methods that aim to develop learning skills, the skills needed to learn in new situations. Simulation involves using a computer-based or other form of model for training, providing an opportunity for systematic exposure to a variety of experiences. Simulations are often used in training complex skills such as flying or command and control. Issues in training highlight the differing emphasis that can be placed on goals related to the process of learning or performance outcomes; motivation and self-management in training; and the transfer dilemma, where effortful methods of training that enhance transfer may be less motivational. Finally, organizational issues and training evaluation are discussed, with a particular emphasis on the role of the environment and follow-up in enhancing transfer of skills learned during training to the workplace.
1. Overview
Training is a set of planned activities organised to bring about learning needed to achieve organizational goals. Organizations undertake training as part of the socialization of newcomers. Even with the best selection systems a gap frequently remains between the knowledge, skills, and abilities (KSAs) required in a job, and those possessed by the individual. Training is used by organizations to improve the fit between what an individual has to offer and what is required. Training is also used when job requirements change such as with the introduction of new technology. Because of the frequency with which job requirements change, a major aim of training is to develop specific skills while also ensuring their transferability and adaptability. An ultimate aim is to increase learning skills where the emphasis in training is on learning to learn. Many methods of training are available (e.g. behavioral modeling, action, computer-based training), with an increasing trend toward incorporating evaluation as part of the training. Although simulations have been used in training for decades, the reduced cost and increased sophistication of simulations now make them more readily available for a wide range of situations. Simulations provide increased opportunities to train for transferable and adaptable skills because trainees can experiment, make errors, and learn from feedback on complex and dynamic real-time tasks. Evaluation of training programmes remains an important component of the training plan, and can be used to enhance transfer.
2. Training Needs Analysis
Most training starts with a training needs analysis, where the present and future tasks and jobs that people do are analyzed to determine the task and job requirements. The identified requirements are then compared with what the individual has to offer. This is done within a broader organizational context. Many different methods of job, task, and organizational analysis can be used, including observation, questionnaires, key people consultation, interviews, group discussion, using records, and work samples (that is, letting people perform a certain activity and checking what they need to know to perform well). A transfer of training needs analysis (Hesketh 1997a) places a particular emphasis on identifying the cognitive processes that must be practised during learning to ensure that the skills can be transferred to contexts and tasks beyond the learning environment.
3. The Training Plan and Methods
There has been an increase in the methods of training that can be used within a broader training plan that includes transfer and evaluation (Quinones and Ehrenstein 1997). The training plan provides a way of
combining the illustrative methods discussed below to ensure that the training needs identified in the analysis can be met.
3.1 Behavior Modeling
Behavior modeling is often combined with role-playing in training. A model is presented on video or in real life, and the rationale for the special behaviors of the model is discussed with the trainees. Trainees then role play the action or interaction, and receive feedback from the trainers and fellow trainees. This type of training is very effective as it provides an opportunity for practice with feedback (e.g. Latham and Saari 1979).
3.2 Action Training
Action training follows from action theory (Frese and Zapf 1994) and exploratory learning. Key aspects of action training involve active learning and exploration, often while doing a task. Action learning is particularly effective as a method of training (Smith et al. 1997). Another important aspect of action training involves obtaining a good mental model of the task and how it should be approached. A mental model is an abstraction or representation of the task or function. Trainees can be helped to acquire a mental model through the use of ‘orientation posters’ or advance organizers, or through the provision of heuristic rules (rules of thumb) (Volpert et al. 1984). One of the advantages of action training is the opportunity to learn from feedback and errors. Feedback is particularly important in the early stages of learning, but fading the feedback at later stages of learning helps ensure that trainees develop their own self-assessment skills (Schmidt and Bjork 1992). Errors are central in action training, since systematic exposure to errors during learning provides opportunities to correct faulty mental models while providing direct negative feedback. Although earlier learning theory approaches argued that there should be only positive feedback, active error training helps trainees develop a positive attitude toward errors because of their value in learning (Frese et al. 1991).
3.3 Rules versus Examples in Training
Although it has traditionally been assumed that rule-based training provides a sound basis for longer term transfer, recent research suggests that for complex nonlinear problems, exemplar training may be superior. Optimizing the combination of rules and examples may be critical. In order to facilitate transfer,
examples should be chosen carefully to cover the typical areas of the problem. With only a few examples, the rule and the example are often confused. However, individual instances are difficult to recall if trainees are provided with too many examples (DeLosh et al. 1997), suggesting that care needs to be taken when deciding how many examples will be presented in training.
3.4 Developing Learning Skills: Learning to Learn
Downs and Perry (1984) offer a practical way of helping trainees develop learning skills such as knowing that there are different ways of learning (e.g., learning by Memorizing, Understanding and Doing, the MUD categories). Facts need to be memorized, concepts need to be understood, and tasks such as driving a motor car need to be learned by doing. The approach stresses the importance of selecting the most appropriate method for the material to be learned so that learning skills can be trained while also developing content skills. Methods that typically encourage learning skills as well as content skills involve action learning, active questioning, and discovery learning, rather than direct lecturing and instruction. The ideas in the learning to learn literature lead to issues such as developing self-management techniques and focusing on learning.
3.5 Simulation Training
Simulation involves developing a model of an on-the-job situation that can be used for training and other purposes. The advantages of using simulation for training include reduced cost, opportunity to learn from errors, and the potential to reduce complexity during the early stages of learning. Historically simulators have been used for training in military, industrial and transport industries, and in management training where business games are widespread. Pilot training on simulators is well developed, to the extent that many pilots can transfer directly from a flight simulator to a real aircraft (Salas et al. 1998). Within the aviation industry, crew cockpit resource management training has also been undertaken using simulators. Driving simulators are not as well developed, and the transfer of skills learned on a driving simulator to the actual road remains to be established. Nevertheless, there is a view that attitudes and driving decisions can be trained on a simulator. Simulations of management decision situations and command and control in the military, police, and emergency personnel are well developed, and widely used for training. These simulations provide a miniaturized version of real crises, with key decision points extracted and shrunk
in time. They provide an opportunity for the trainee to experiment with decisions, make mistakes, and learn from these errors (Alluisi 1991). Simulation training has also been used to facilitate the acquisition of cross-cultural skills.
4. Issues in Training

4.1 Learning vs. Performance Goals
Whether the emphasis during learning is on performance or learning goals is used to explain differences in how people conceptualize their ability. Dweck and Leggett (1988) argue that some people conceptualize ability as increasing with learning (learning orientation), while others see ability as fixed (performance orientation). People with a learning orientation learn from mistakes and challenges. However, individuals with a performance orientation view mistakes as examples of poor performance and learn less from errors. Performance oriented people also tend to demonstrate a helpless response to problems and are therefore less likely to overcome challenges. Martocchio (1994) showed that ability self-conceptualization was related to computer anxiety and self-efficacy. Motivation plays a key role in transfer of training (Baldwin and Ford 1988). Numerous motivation issues have been studied, the most important ones being self-efficacy, relapse prevention, perceived payoffs, goals, and the training contract. Self-efficacy is critical to transfer in that people will use a skill only if they believe that they can actually perform the appropriate behavior. Relapse prevention focuses on teaching solutions to those situations in which it may prove difficult to use the newly learned skills (Marx 1982). Trainees who receive relapse prevention training have been found to use their skills more often and perform their job better.
4.2 Self-management in Training
Self-management implies that one acquires the skills to deal with difficulties, reward oneself, and increase self-efficacy. Metacognitive strategies that show evidence of self-management and self-reward during training are related to training performance. Furthermore, self-efficacy is related to increased post-training knowledge (Martocchio 1994) and transfer performance. Thus, self-efficacy functions as a predictor of both training and transfer performance.
4.3 Transfer Dilemma
Transfer is important to ensure that the skills learned in one context or on one task can be applied in a range
of different contexts and tasks. For example, fire fighters may learn how to fight fires in urban areas, but their skills must also transfer to the bush or rural areas where they are frequently required to work in emergencies. In the field of technology, training in the use of one spreadsheet or database should lead to transfer to different spreadsheets and databases and a range of other software packages. Baldwin and Ford (1988) provide a model that highlights several factors that influence transfer, including the similarity of the training and transfer situation, the nature of the training methods used, and the extent to which environmental factors reinforce transfer. Annett and Sparrow (1986) explained that transfer would be best if the stimulus situations and behaviors trained shared identical elements with stimuli in the work environment and the behaviors required there. Because these are seldom identical, it is important to use methods of training that encourage learners to bridge the gap and transfer their skills (Hesketh 1997b), and to create an environment that reinforces them for doing so. A transfer of training needs analysis can be used to discover how best to design training to increase transfer knowledge and skill (Hesketh 1997a). For example, a transfer of training needs analysis for a team leader in fire fighting would highlight the types of decisions that needed to be made in different fire incidents and the specific cues used in making the decisions. This information would be used to design training that emphasized practice of these decision processes in a range of systematically chosen contexts to facilitate transfer. This approach ensures that the cognitive skills required for transfer are practised during training, and that knowledge about transfer and likely barriers is discovered during training (Von Papstein and Frese 1988). Druckman and Bjork (1992) have highlighted a dilemma in that the methods of training that trainees enjoy, and that often lead to better performance on the training task, are not necessarily the ones that enhance long term retention and transfer. Trainees enjoy methods of training that require less effortful cognitive processing. Yet to facilitate transfer, trainees need to engage in the problem solving and active processes that they will be required to do on transfer. This may create motivational difficulties. Designing the training to provide an appropriate level of challenge is important for both motivation and transfer.

4.4 Simulation and Transfer
Early debate about the appropriate level of physical and psychological fidelity of simulation for training remains important, but has been incorporated into the more general issue of transfer and generalization. A high fidelity simulation may facilitate transfer to a single context, but lower fidelity simulators may be more appropriate if transfer is required to a range of situations. Current research issues are addressing the
best ways of integrating simulation with other forms of training, and how to optimize the level of fidelity for the particular purpose and type of simulation. The debate is being informed by research on the best way of combining rules and examples for training. Simulators provide an ideal opportunity to structure systematic exposure to a carefully chosen set of exemplar situations. Simulators have traditionally been used for training, but their potential use is much more widespread. Simulation may provide a more realistic selection task for situations that require dynamic decision-making. Simulations are also being used as a way of signing off competency levels. Questions remain about how to deal with re-testing with a simulator, e.g., whether the trainee should be given an opportunity to perform again on exactly the same sequence as used during an initial test or a transfer problem. Here the research on transfer of training is critical, and may be of use in resolving the reassessment debate. This illustrates the ways in which selection, assessment, and training are strongly related areas of research and practice.

4.5 Organizational Issues
Organizational characteristics often influence transfer success. Trainees develop expectations about whether or not it will pay off when they use what they have learned in training. Often, companies teach one thing and reward a completely different behavior. For example, trainees may learn to be cooperative in a training course, but may then be paid for their individual contribution in a highly competitive environment. In such situations there is no transfer. Trainees need to be reinforced for what they have learned, and to practise skills in circumstances where errors can be made without serious consequences. For example, a practice niche can be created where a bank clerk who has learned a new program to calculate mortgages is provided with an opportunity to practise it first while answering written requests. Thus, the customer does not see all the mistakes the bank clerk makes when using the new program.

4.6 Evaluating Training
The importance of evaluation has always been emphasized in training, although until recently approaches were somewhat traditional and limited. The importance of training evaluation has been recognized because intuitive guesses about what works are often wrong. Druckman and Bjork (1991) concluded that many companies in the USA continue to use methods of training known to be suboptimal for transfer because they rely on short term reactions evaluation, rather than examining the longer term retention and transfer of skills. Training evaluation can also be used as a way of indicating the importance
of the skills learned, and of providing an opportunity to practise. Integrating training evaluation into the training plan is the best way of achieving this. The more detailed understanding of ways in which knowledge structures change with skill acquisition has also provided a basis for evaluating training. For example, experts tend to have hierarchically organized knowledge structures, and are able to read off solutions to problems far more quickly. These ideas can be used to suggest innovative ways of evaluating training programs.
Bibliography
Alluisi E A 1991 The development of technology for collective training: SIMNET, a case history. Human Factors 33: 343–62
Annett J, Sparrow J 1986 Transfer of training: A review of research and practical implications. Programmed Learning and Educational Technology (PLET) 22(2): 116–24
Baldwin T T, Ford J K 1988 Transfer of training: A review and directions for future research. Personnel Psychology 41: 63–105
DeLosh E L, Busemeyer J R, McDaniel M A 1997 Extrapolation: The sine qua non for abstraction in function learning. Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 968–86
Downs S, Perry P 1984 Developing learning skills. Journal of European Industrial Training 8: 21–6
Druckman D, Bjork R A 1991 In the Mind's Eye: Enhancing Human Performance. National Academy Press, Washington, DC
Dweck C S, Leggett E L 1988 A social-cognitive approach to motivation and personality. Psychological Review 95: 256–73
Frese M, Brodbeck F C, Heinbokel T, Mooser C, Schleiffenbaum E, Thiemann P 1991 Errors in training computer skills: On the positive functions of errors. Human–Computer Interaction 6: 77–93
Frese M, Zapf D 1994 Action as the core of work psychology: A German approach. In: Triandis H C, Dunnette M D, Hough L M (eds.) Handbook of Industrial and Organizational Psychology, 2nd edn. Consulting Psychologists Press, Palo Alto, CA, Vol. 4, pp. 271–340
Hesketh B 1997a W(h)ither dilemmas in training for transfer. Applied Psychology: An International Review 46: 380–6
Hesketh B 1997b Dilemmas in training for transfer and retention. Applied Psychology: An International Review 46: 317–39
Latham G P, Saari L M 1979 Application of social-learning theory to training supervisors through behavioural modelling. Journal of Applied Psychology 64: 239–46
Martocchio J J 1994 Effects of conceptions of ability on anxiety, self-efficacy, and learning in training. Journal of Applied Psychology 79: 819–25
Marx R D 1982 Relapse prevention for managerial training: A model for maintenance of behavior change. Academy of Management Review 7: 433–41
Quinones M A, Ehrenstein A 1997 Training for a Rapidly Changing Workplace: Applications of Psychological Research. APA, Washington, DC
Salas E, Bowers C A, Rhodenizer L 1998 It is not how much you have but how you use it: toward a rational use of simulation to support aviation training. International Journal of Aviation Psychology 8: 197–208
Schmidt R A, Bjork R A 1992 New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science 3: 207–17
Smith E, Ford J K, Kozlowski S 1997 Building adaptive expertise: Implications for training design. In: Quinones M A, Ehrenstein A (eds.) Training for a Rapidly Changing Workplace: Applications of Psychological Research. APA, Washington, DC
Volpert W, Frommann R, Munzert J 1984 Die Wirkung heuristischer Regeln im Lernprozeß. Zeitschrift für Arbeitswissenschaft 38: 235–40
Von Papstein P, Frese M 1988 Transferring skills from training to the actual work situation: The role of task application knowledge, action styles, and job decision latitude. In: Soloway E, Frye D, Shepard S B (eds.) Human Factors in Computing Systems, ACM SIGCHI Proceedings, CHI '88, pp. 55–60
B. Hesketh and M. Frese
Simultaneous Equation Estimates (Exact and Approximate), Distribution of

A simple example of a system of linear simultaneous equations may consist of production and consumption functions of a nation: Y = a + bK + cL + error, and C = d + eY + error. The variables Y, K, L, and C represent the gross domestic product (GDP), the capital equipment, the labor input, and the consumption, respectively. These variables are measures of the level of economic activity of a nation. In the production function, Y increases if the inputs K and/or L increase. In the consumption equation, C increases if Y increases. Each equation is modeled to explain the variation in the left-hand side 'explained' variable by the variation in the right-hand side 'explanatory' variables. Error terms are added to analyze numerically the effect of the factors omitted from the right-hand side of the equation. These equations differ from regression equations since the 'explained' variable Y is an 'explanatory' variable in the C equation, and Y and C are simultaneously determined by the two equations. Estimation of the unknown coefficients, and the properties of the estimation methods, are not as straightforward as for the ordinary least squares estimator. In practice, this kind of simultaneous equation system is extended to include more than 100 equations, and is regularly updated to measure the economic activities of a nation. It is indispensable for analyzing numerically the effect of policy changes and public investments. In this article, the statistical model and the estimation methods of all the equations are first explained, followed by the estimation methods of a
single equation and their asymptotic distributions. Explained next are the exact distributions, the asymptotic expansions, and the higher order efficiency of the estimators.
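Before turning to the formal model, the simultaneity problem is easy to see in a small simulation. The following sketch is ours, with purely illustrative parameter values (not from the text); the structural errors are made correlated so that Y is correlated with the consumption error, which biases least squares, while using the exogenous K as an instrument does not:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000

a, b, c = 1.0, 0.3, 0.5        # production: Y = a + b*K + c*L + u1 (hypothetical values)
d, e = 2.0, 0.6                # consumption: C = d + e*Y + u2

K = rng.uniform(1, 10, T)      # capital and labor, treated as exogenous
L = rng.uniform(1, 10, T)
u1 = rng.normal(0, 1, T)
u2 = 0.8 * u1 + rng.normal(0, 1, T)   # correlated errors create the simultaneity bias

Y = a + b * K + c * L + u1
C = d + e * Y + u2

Yc, Cc, Kc = Y - Y.mean(), C - C.mean(), K - K.mean()
e_ols = (Yc @ Cc) / (Yc @ Yc)  # least squares: inconsistent, overstates e
e_iv = (Kc @ Cc) / (Kc @ Yc)   # instrumental variables with K: consistent for e
print(e_ols, e_iv)             # roughly 0.84 versus 0.60 with these values
```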
1. The System of Simultaneous Equations and Identification of the System

We write the structural form of a system consisting of G simultaneous equations as

y_i = Y_i β_i + Z_i γ_i + u_i = (Y_i, Z_i)δ_i + u_i,  i = 1, …, G  (1)
where y_i and Y_i are 1 and G_i subcolumns in the T×G matrix of whole endogenous variables Y = (y_i, Y_i, Ỹ_i), Z_i consists of K_i subcolumns in the T×K matrix of whole exogenous variables Z, β_i and γ_i are G_i×1 and K_i×1 column vectors of unknown coefficients, δ_i = (β_i′, γ_i′)′, and u_i is the T×1 error term. This system of G equations with T observations is frequently summarized in the simple form

YB + ZΓ = U  (2)
the ith column of which is y_i − Y_iβ_i − Z_iγ_i = u_i, i.e., Eqn. (1). The ith columns of B and Γ may be denoted as b_i and c_i, in which (G−G_i−1) and (K−K_i) elements are zero, so that y_i − Y_iβ_i − Z_iγ_i = Yb_i + Zc_i. Zero elements are called zero restrictions. It is assumed that each row of U is independently distributed as N(0, Σ). The reduced form of Eqn. (2) is

Y = ZΠ + V  (3)
where the K×G reduced form coefficient matrix is Π = −ΓB⁻¹, and the T×G reduced form error term is V = UB⁻¹. Each row of V is assumed to be independently distributed as N(0, Ω), and then Σ = B⁻¹′ΩB⁻¹. The definition Π = −ΓB⁻¹, or −ΠB = Γ, is the key to identifying the structural coefficients. The coefficients in β_i are identified if they can be uniquely determined by the equation −Πb_i = c_i given Π. Denoting the (K−K_i)×(1+G_i) submatrix of Π as Π₀ (rows and columns are selected according to the zero elements in c_i and the non-zero elements in b_i, respectively), this equation reduces to −Π₀(1, −β_i′)′ = 0. Given Π₀, this comprises (K−K_i) linear equations in G_i unknowns, and β_i is solvable if rank(Π₀) = G_i. This means (K−K_i) must be at least G_i, or L = K − K_i − G_i must be at least 0. If L = 0, β_i is uniquely determined. For positive L, there are L linearly dependent rows in Π₀ since only G_i rows are necessary to determine β_i uniquely. Once β_i is determined, γ_i is determined by the other K_i equations through −Πb_i = c_i. L is called the number of degrees of overidentifiability of the ith equation.
Structural coefficients are not uniquely estimable if they are not identified (see Statistical Identification and Estimability).
2. Estimation Methods of the Whole System and the Asymptotic Distribution

The full information maximum likelihood (FIML) estimator of all nonzero structural coefficients δ_i, i = 1, …, G, follows from Eqn. (3). Since it is in a linear regression form, the likelihood function can first be maximized with respect to Ω. Once Ω is replaced by the first-order condition, the likelihood function is concentrated so that only B and Γ are unknown. The concentrated likelihood function is proportional to

ln|Ω̄|,  Ω̄ = (Y − ZΠ̄)′(Y − ZΠ̄)/T,  Π̄ = −ΓB⁻¹,  (4)
and all zero restrictions are included in the B and Γ matrices. In the FIML estimation, it is necessary to minimize |Ω̄| with respect to all non-zero structural coefficients. The FIML estimator is consistent, and the asymptotic distribution is derived by the central limit theorem. Stacking δ_i, i = 1, …, G in a column vector δ, the FIML estimator δ̂ is asymptotically normal:

√T(δ̂ − δ) →_D N(0, −I⁻¹),  I = lim_{T→∞} E[(1/T) ∂²ln|Ω̄|/∂δ∂δ′].  (5)
I is the limit of the average of the information matrix, i.e., −I⁻¹ is the asymptotic Cramér–Rao lower bound. The FIML estimator is therefore the best among consistent and asymptotically normal (BCAN) estimators. The right-hand side endogenous variable Y_i in (1) is defined by a set of G_i columns in (3), such as Y_i = ZΠ_i + V_i. By the definition of V, Y_i or, equivalently, V_i is correlated with u_i since the columns of U are correlated with each other. The least squares estimator applied to (1) is inconsistent because of the correlation between Y_i and u_i. Since Z is assumed to be uncorrelated with U in the limit, Z is used as K instruments in the instrumental variable estimator. Premultiplying (1) by Z′, it follows that

Z′y_i = (Z′Y_i, Z′Z_i)δ_i + Z′u_i = (0, …, 0, Z′Y_i, Z′Z_i, 0, …, 0)δ + u_i*,  i = 1, …, G  (6)

where the transformed right-hand side variables Z′Y_i are not correlated with u_i* in the limit. Stacking all G transformed equations in a column, the G equations are summarized as w = Xδ + u*, where w and u* stack Z′y_i and u_i*, i = 1, …, G, respectively, and are GK×1. The covariance between u_i* and u_j* is σ_ij(Z′Z), which is the ith row and jth column sub-block in the
covariance matrix of u*. (The whole covariance matrix can be written as Σ⊗(Z′Z), where ⊗ signifies the Kronecker product.) Once Σ is estimated consistently (by the 2SLS method explained in the next section), δ is efficiently estimated by the generalized least squares method:

δ̂_3SLS = {X′[Σ̂⁻¹⊗(Z′Z)⁻¹]X}⁻¹{X′[Σ̂⁻¹⊗(Z′Z)⁻¹]w}.  (7)

This is the three-stage least squares (3SLS) estimator of Zellner and Theil (1962). The assumption of normally distributed errors is not required in this estimation. The 3SLS estimator is consistent and is BCAN since it has the same asymptotic distribution as the FIML estimator.
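As a minimal sketch (our notation and function name, not the article's), Eqn. (7) can be transcribed directly once the stacked vector w, the block regressor matrix X, and a consistent estimate of Σ have been formed:

```python
import numpy as np

def three_sls(Z, X, w, Sigma_hat):
    """3SLS estimator of Eqn. (7).

    Z         : T x K matrix of exogenous variables
    X         : GK x p stack of the transformed regressors (Z'Y_i, Z'Z_i)
    w         : GK vector stacking Z'y_i, i = 1, ..., G
    Sigma_hat : G x G consistent estimate of the structural error covariance
    """
    # Weight matrix of the generalized least squares step.
    W = np.kron(np.linalg.inv(Sigma_hat), np.linalg.inv(Z.T @ Z))
    XtW = X.T @ W
    return np.linalg.solve(XtW @ X, XtW @ w)
```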
3. Estimation Methods of a Single Equation and the Asymptotic Distribution

An alternative way of estimating structural coefficients is to pick one structural equation, the ith, in a G-equation system, and estimate δ_i neglecting the zero restrictions in the other equations. Because of this, the other (G−1) structural equations can be rewritten equivalently as (G−1) reduced form equations. The limited information maximum likelihood (LIML) estimator of Anderson and Rubin (1949) applies the FIML method to a (1+G_i)-equation system consisting of (1) and Y_i = ZΠ_i + V_i. This means the first column and the next G_i columns of B are (1, −β_i′)′ and (0, I)′, respectively, and the first column and the next G_i columns of Γ are (γ_i′, 0′)′ and Π_i, respectively. (If we denote Y as (y_i, Y_i, Ỹ_i), Ỹ_i is weakly exogenous in estimating (1) or, equivalently, the structural and reduced form parameters of (y_i, Y_i) given Ỹ_i are variation-free from the reduced form coefficient of Ỹ_i. Then Ỹ_i is omitted from (2), (3), (4), and (8).) Using these limited information B and Γ matrices, we minimize |Ω̄| with respect to δ_i and Π_i. Defining P_F = F(F′F)⁻¹F′ for any full column rank matrix F, and

G = (y_i, Y_i)′(P_Z − P_{Z_i})(y_i, Y_i),  C = (y_i, Y_i)′(I − P_Z)(y_i, Y_i),

it turns out that β_i is estimated by minimizing the least variance ratio

λ(β_i) = (1, −β_i′)G(1, −β_i′)′ / (1, −β_i′)C(1, −β_i′)′.  (8)
γ_i is estimated by the least squares method applied to (1), replacing β_i with its estimator. The two-stage least squares (2SLS) estimator is the generalized least squares estimator applied to Eqn. (6) using Z′Z as the weight matrix. (See Eqn. (10), where k is set to 1.) The assumption of a normally distributed error term is not required in this estimation. Both the LIML and 2SLS estimators are consistent, and the large sample distribution is

√T(δ̂_i − δ_i) →_D N(0, −I⁻¹(δ_i, δ_i))  (9)
where I is calculated similarly to Eqn. (5) but Ω̄ is defined using the limited information B and Γ matrices, and only the diagonal submatrix of I⁻¹ which corresponds to δ_i is used. (Partial derivatives are calculated with respect to δ_i and the columns in Π_i.) This asymptotic distribution is a particular case of Eqn. (5). Both estimators are consistent and BCAN under the zero restrictions imposed on Eqn. (1). The k-class estimator δ̂_i(k) unifies the LIML and 2SLS estimators (Theil 1961). It is

δ̂_i(k) = [Y_i′P_ZY_i − (k−1)Y_i′(I−P_Z)Y_i, Y_i′Z_i; Z_i′Y_i, Z_i′Z_i]⁻¹ [Y_i′P_Z − (k−1)Y_i′(I−P_Z); Z_i′] y_i.  (10)
This is the least squares estimator, the 2SLS estimator, and the LIML estimator when k is 0, 1, and 1+λ, respectively. There are two important properties of the k-class estimator. It is consistent if plim_{T→∞} k = 1, and is BCAN if plim_{T→∞} √T(k−1) = 0. If k satisfies these conditions, the k-class estimator is consistent and BCAN even when Y_i′(I−P_Z)Y_i and Y_i′(I−P_Z)y_i are replaced with any matrix and vector of order O_P(T).
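A compact transcription of Eqn. (10) is sketched below (our own, hypothetical function; the inputs are the data matrices of the ith equation). Setting k to 0, 1, or 1 + λ reproduces the OLS, 2SLS, or LIML estimator, where λ is the least variance ratio of Eqn. (8):

```python
import numpy as np

def k_class(y_i, Y_i, Z_i, Z, k):
    """k-class estimator of delta_i = (beta_i', gamma_i')' in Eqn. (10).

    y_i : (T,) left-hand side variable
    Y_i : T x G_i right-hand side endogenous variables
    Z_i : T x K_i included exogenous variables
    Z   : T x K matrix of all exogenous variables
    """
    T = Z.shape[0]
    P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)   # projection onto Z
    M_Z = np.eye(T) - P_Z                     # I - P_Z
    A = np.block([
        [Y_i.T @ P_Z @ Y_i - (k - 1) * (Y_i.T @ M_Z @ Y_i), Y_i.T @ Z_i],
        [Z_i.T @ Y_i, Z_i.T @ Z_i],
    ])
    b = np.concatenate([
        Y_i.T @ P_Z @ y_i - (k - 1) * (Y_i.T @ M_Z @ y_i),
        Z_i.T @ y_i,
    ])
    return np.linalg.solve(A, b)
```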
4. Exact Distributions of the Single-equation Estimators

Several early studies compared the bias and mean squared errors of the OLS, LIML, 2SLS, FIML, and 3SLS estimators by Monte Carlo simulation, since all but the OLS estimator are consistent and asymptotically indistinguishable. The OLS estimator was often found to be as reliable as the consistent estimators. Later, the studies went on to t ratios, and the real defect of the OLS estimator was found: the deviation from the standard normal distribution is worse than for any of the simultaneous equation methods. See Cragg (1967) for related papers. Drawing general qualitative comparisons from simulations is difficult since simulations require setting values of all population parameters. Simulation studies on the small sample properties led to the derivation of the exact distributions, which were expected to permit the drawing of general comparisons without depending on particular parameter values. If n×1 column vectors x_t, t = 1, …, T are independently distributed N(m_t, Ω), the density function of Σ_{t=1}^T x_t x_t′ is the non-central Wishart distribution denoted as W_n(T, Ω, M), where the non-centrality parameter is M = Σ_{t=1}^T m_t m_t′ (Anderson 1958, Chap. 13). The study of the exact distribution of the single-equation estimators started from the fact that G = (G_{kl}) and C = (C_{kl}), k, l = 1, …, 1+G_i in Eqn. (8)
are the noncentral Wishart matrix W_{1+G_i}(K−K_i, Ω, M) with the noncentrality parameter M = Π̄′Z′(P_Z − P_{Z_i})ZΠ̄, and the central Wishart matrix W_{1+G_i}(T−K, Ω, 0), respectively. The 2SLS, OLS, and LIML estimators of β_i are G₂₂⁻¹G₂₁, (G₂₂+C₂₂)⁻¹(G₂₁+C₂₁), and (G₂₂−λC₂₂)⁻¹(G₂₁−λC₂₁), respectively, where λ is the minimum root of the polynomial equation |G − λC| = 0. Since all estimators are functions of elements in the G and C matrices, their distributions can be characterized by the degrees of freedom of the two Wishart matrices and the M and Ω matrices. In deriving the exact density functions, the 2SLS and OLS estimators can be treated in a similar way. For the 2SLS estimator, the joint density of G is transformed into the joint density of G₂₂⁻¹G₂₁, G₁₁ − G₁₂G₂₂⁻¹G₂₁, and G₂₂. Integrating out G₁₁ − G₁₂G₂₂⁻¹G₂₁ and G₂₂ results in the joint density of G₂₂⁻¹G₂₁. The resulting density function includes infinite terms, and zonal polynomials when G_i is greater than one. A pedagogical derivation is found in Press (1982, Chap. 5). For the LIML estimator, the joint density of G and C is transformed into that of characteristic roots and vectors. Since β̂ can be rewritten as a ratio of elements of a characteristic vector, the density function is derived by integrating out unnecessary random variables from the joint density function. However, the analytical operations are not easy when there are many endogenous variables. See Phillips (1983, 1985) for comprehensive reviews on the 2SLS and LIML estimators, respectively. It was somewhat fruitless to derive exact distributions because these include nuisance parameters and infinite terms. It was difficult to draw general conclusions on the qualitative properties of the estimators from numerical evaluations of these distributions. See Anderson et al. (1982). Qualitative properties of the estimators followed from the exact moments of the estimators. Kinal (1980) proved that the (fixed) k-class estimator in the multiple endogenous variables case has moments up to (T−K_i−G_i) if 0 ≤ k < 1, and up to L if k = 1. Mariano and Sawa (1972) proved that, in the G_i = 1 case, the mean and variance of the LIML estimator do not exist. (In Monte Carlo simulations, the bias of the LIML estimator was often found to be smaller than that of others even though the exact mean is infinite. This shows a clear limitation of simulation methods.)
5. Asymptotic Expansions of the Distributions of the Single-equation Estimators

Asymptotic expansion of the distribution was introduced as an analytical tool which is more accurate than the asymptotic distribution but is less complicated than the exact distributions. For instance, the
t ratio statistic, say X, which is commonly used in econometrics, has the density function f(x) = c[1 + (x²/m)]^{−(1+m)/2}, where c is a constant and m is the degrees of freedom, under conditions including normally distributed error terms. Since the mean and variance of X are 0 and m/(m−2), the standardized statistic Z = √((m−2)/m)·X has the density f(z) = c′[1 + z²/(m−2)]^{−(1+m)/2}, where c′ is a new constant. This density function is expanded to the third-order term as

f(z) = φ(z){1 + (1/4m)(z⁴ − 6z² + 3)} + o(m⁻¹)  (11)
where φ(z) is the standard normal density function, and the constant is adjusted so that the area under the curve is one. (Since the t distribution is symmetric around the origin, the O(1/√m) term does not appear on the right-hand side of the equation.) Rewriting in terms of X, the asymptotic expansion of the t statistic with m degrees of freedom is

f(x) = φ(x) + (1/4m)(x⁴ − 2x² − 1)φ(x) + o(m⁻¹)  (12)
The first term on the right-hand side is the N(0, 1) density function. The second term converges to zero as m grows. This second term gives the deviation of f(x) from φ(x), and is called the third-order term. For finite m, the third-order term is expected to improve the standard normal approximation. The numerical evaluation of this expansion is easy. The asymptotic expansions for the simultaneous equation estimators are long and include nuisance parameter matrices such as q below. See Phillips (1983) for a review and Phillips (1977) for the validity of the expansion. The asymptotic expansion does not require the assumption of normally distributed error terms. Fujikoshi et al. (1982) gave the expansion of the joint density of the estimators of δ_i. In their study, the bias of the estimators is calculated from the asymptotic expansions as

AM(√T(δ̂_i − δ_i)) = (1/√T)(dL − 1)I⁻¹q + o(T⁻¹),  q = (1/σ²)[Ω₂₁ − Ω₂₂β_i; 0],  (13)
where δ̂ is the estimator, d is 1 and 0 for the 2SLS and LIML estimators, respectively, and the Ω matrix is partitioned into submatrices conformable with the partitions of the G and C matrices. AM(·) stands for the mean operator, but uses the asymptotic expansion for the density function. (Recall that the exact mean does
not exist in the LIML estimator.) It is possible to compare the two estimators in terms of the calculated bias. For example, the bias of the 2SLS estimator is 0 when the degree of overidentifiability L is 1. Further, the mean of the squared errors was calculated from the asymptotic expansions and used to compare estimators. It was proved that the mean squared error of the 2SLS estimator is smaller than that of the LIML estimator when L is less than or equal to 6. Historically, this kind of comparison of 'approximate' mean squared errors goes back to the 'Nagar expansion' by Nagar (1959) and the 'small disturbance expansion' by Kadane (1971). These qualitative comparisons gave researchers some guidance about the choice of estimators. It was interesting to examine the accuracy of the approximations calculated by the asymptotic expansions of distributions. If the asymptotic expansions were accurate, calculation of the exact distributions could be avoided, and properties of estimators could be found easily from the approximations. However, the approximations were found not to be accurate enough to replace the exact distributions. In many cases the asymptotic expansion is accurate when the asymptotic distribution, the first term in the expansion, is already accurate; the asymptotic expansion is inaccurate when the asymptotic distribution is inaccurate. In particular, the asymptotic expansions are inaccurate when (a) the value of the asymptotic variance is small; (b) the value of L is large; or (c) the structural error term is highly correlated with the right-hand side endogenous variable. It is noted, first, that Z′Z/T is assumed to converge to a nonsingular fixed matrix in the asymptotic theory. Second, for an attempt to improve the accuracy of asymptotic distributions by incorporating large L values, see Morimune (1983). Third, the asymptotic expansions cannot trace closely the skewed exact distributions that arise particularly when the correlation is high.
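The claim that expansions such as Eqn. (12) are easy to evaluate numerically is simple to check. The snippet below (illustrative, with m = 10) compares the exact t density with the standard normal density alone and with the third-order term added; the correction reduces the approximation error over a central grid of points:

```python
import numpy as np
from math import gamma, pi, sqrt

m = 10                                   # degrees of freedom
x = np.linspace(-4, 4, 17)

# Exact t density with m degrees of freedom.
const = gamma((m + 1) / 2) / (sqrt(m * pi) * gamma(m / 2))
exact = const * (1 + x**2 / m) ** (-(m + 1) / 2)

phi = np.exp(-x**2 / 2) / sqrt(2 * pi)   # standard normal density
approx = phi + (x**4 - 2 * x**2 - 1) * phi / (4 * m)   # Eqn. (12)

print(np.abs(exact - phi).max())         # error of the N(0, 1) approximation
print(np.abs(exact - approx).max())      # smaller with the third-order term
```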
6. Higher-order Efficiency of the System and the Single-equation Estimators

The asymptotic expansions of distributions can be used to compare the probabilities of concentration of estimators about the true parameter values. One estimator is more desirable than another if its probability is greater than that of the other. This measure was used in comparing the single-equation estimators, and some qualitative results were derived. Furthermore, the third-order efficiency criterion was brought into the comparisons. This criterion requires that estimators be adjusted to have the same asymptotic bias as in Eqn. (13). Then the adjusted estimators are compared, and the maximum likelihood
estimator is proved most efficient. It has the highest concentration about the true parameter values in terms of the asymptotic expansion of the distribution to the third-order O(1/T) terms (Akahira and Takeuchi (1981), for example). The adjusted maximum likelihood estimator has the smallest mean squared error at the same time, since the difference among the adjusted estimators is found only in the mean squared errors. In whole system estimation, the FIML estimator is third-order efficient. The 3SLS estimator is less efficient than the FIML estimator in terms of the asymptotic probability of concentration once the bias of the two estimators is adjusted to be the same. Morimune and Sakata (1993) derived a simple adjustment of the 3SLS estimator so that the adjusted estimator has the same asymptotic expansion as the FIML estimator to the third-order O(1/T) terms. This estimator is explained by modifying Eqn. (7). In Eqn. (6), Y_i is replaced by Ŷ_i ≡ ZΠ̂_i = Z(Z′Z)⁻¹Z′Y_i so that the X matrix consists of Z′Ŷ_i and Z′Z_i. In the modified estimator, we estimate Σ and Π by the first round 3SLS estimator and replace Ŷ_i in X by Ȳ_i ≡ ZΠ̄_i, where Π̄_i consists of the proper subcolumns in Π̄ = −Γ̂B̂⁻¹. The new X matrix is denoted X̄. Finally, the new estimator is

δ̃_3SLS = {X̄′[Σ̂⁻¹⊗(Z′Z)⁻¹]X̄}⁻¹{X̄′[Σ̂⁻¹⊗(Z′Z)⁻¹]w}.  (14)

This estimator has the same asymptotic expansion as the FIML estimator to the third-order terms and is third-order efficient. The LIML and 2SLS estimators are simple cases of the FIML and 3SLS estimators, respectively. Then the modified 2SLS estimator which follows from Eqn. (14) has the same asymptotic expansion as the LIML estimator to the third-order term. The LIML estimator and the modified 2SLS estimator are third-order efficient in single-equation estimation.
7. Conclusion

Lawrence Klein received the 1980 Nobel prize for the creation of a macroeconometric model, which is an empirical form of a simultaneous equation system, and for its application to the analysis of economic fluctuations and economic policies. The macroeconometric model became a standard tool for analyzing the economies and policies of nations. Trygve Haavelmo received the 1989 Nobel prize for his contribution to the analysis of simultaneous structures. Haavelmo, together with other researchers at the Cowles Commission for Research in Economics, then at the University of Chicago, became the founders of simultaneous equation analysis in econometrics. Part of their research is collected in Hood and Koopmans
(1953). Studies on the exact and approximate distributions of estimators came after the research conducted by the Cowles group, and helped to make econometrics rigorous. Access to computers was the main concern when econometric model-building started spreading all over the world in the 1970s. Since then, computer facilities surrounding econometric model-building have changed greatly. Bulky mainframe computers have been replaced by personal computers. Computer programs were once written individually, mostly in Fortran, and were used for regression analyses in model estimation as well as for simulation studies in econometric theory. Packaged least squares programs later replaced the individually written programs in model estimation. They run on personal computers and have greatly facilitated the conduct of empirical studies.

See also: Simultaneous Equation Estimation: Overview
Bibliography
Akahira M, Takeuchi K 1981 Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Springer, New York
Anderson T W 1958 An Introduction to Multivariate Statistical Analysis. Wiley, New York
Anderson T W, Rubin H 1949 Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20: 46–63
Anderson T W, Kunitomo N, Sawa T 1982 Evaluation of the distribution function of the limited information maximum likelihood estimator. Econometrica 50: 1009–27
Basmann R L 1957 A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25: 77–83
Cragg J G 1967 On the relative small sample properties of several structural equation estimators. Econometrica 35: 89–110
Fujikoshi Y, Morimune K, Kunitomo N, Taniguchi M 1982 Asymptotic expansions of the distributions of the estimates of coefficients in a simultaneous equation system. Journal of Econometrics 18: 191–205
Hood W C, Koopmans T C 1953 Studies in Econometric Methods. Wiley, New York
Kadane J B 1971 Comparison of k-class estimates when the disturbance is small. Econometrica 39: 723–37
Kinal T W 1980 The existence of moments of k-class estimators. Econometrica 48: 241–9
Mariano R S, Sawa T 1972 The exact finite sample distribution of the limited information maximum likelihood estimator in the case of two included endogenous variables. Journal of the American Statistical Association 67: 159–63
Morimune K 1983 Approximate distributions of k-class estimators when the degree of over-identifiability is large compared with the sample size. Econometrica 51: 821–41
Morimune K, Sakata S 1993 Modified three-stage least squares estimator which is third-order efficient. Journal of Econometrics 57: 257–76
Nagar A L 1959 The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95
Phillips P C B 1977 A general theorem in the theory of asymptotic expansions as approximations to the finite sample distributions of econometric estimators. Econometrica 45: 1517–34
Phillips P C B 1983 Exact small sample theory in the simultaneous equations model. In: Griliches Z, Intriligator M D (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 1, Chap. 8
Phillips P C B 1985 The exact distribution of LIML: 2. International Economic Review 25(1): 249–61
Press S J 1982 Applied Multivariate Analysis: Using Bayesian and Frequentist Methods of Inference [original edn. 1972, 2nd edn.]. Robert E Krieger, Malabar, FL
Theil H 1961 Economic Forecasts and Policy, 2nd edn. North-Holland, New York, pp. 231–2, 334–6
Zellner A, Theil H 1962 Three-stage least squares: simultaneous estimation of simultaneous equations. Econometrica 30: 54–78
K. Morimune
Simultaneous Equation Estimation: Overview
Simultaneous equations are important tools for understanding behavior when two or more variables are determined by the interaction of two or more relationships in such a way that causation is joint rather than unidirectional. Such situations abound in economics, but also occur elsewhere. Haavelmo (1943, 1944) began their modern econometric treatment. The simplest economic example is the interaction of buyers and sellers in a competitive market which jointly determines the quantity sold and price. Another important example is the interaction of workers, consumers, investors, firms, and government in determining the economy's output, employment, price level, and interest rates, as in macroeconometric forecasting models. Even when one is interested only in a single equation, it often is best interpreted as one of a system of simultaneous equations. Strotz and Wold (1960) argued that in principle every economic action is a response to a previous action of someone else, but even they agreed that simultaneous equations are useful when the data are yearly, quarterly, or monthly, because these periods are much longer than the typical market response time. This article discusses the essentials of simultaneous equation estimation using a simple linear example.

1. A Simple Supply–Demand Example

Let q and p stand for the quantity sold and price of a good, y and w for the income and wealth of buyers (assumed independently determined), u₁ and u₂ for unobservable random shocks, and Greek letters for unknown constant parameters. Suppose that market equilibrium, in which the price is such that suppliers want to sell the same quantity that demanders want to buy, is described by these linear equations:

supply: q = γ₁ + β₁p + u₁,  β₁ > 0  (1)
demand: q = γ₂ + δ₂y + ε₂w + β₂p + u₂,  β₂ < 0  (2)

Neither equation alone can determine either p or q, because each equation contains both: these are simultaneous equations. Solve them for p and q thus:
p = π₁₁ + π₁₂y + π₁₃w + (u₂ − u₁)/∆  (3)
q = π₂₁ + π₂₂y + π₂₃w + (β₁u₂ − β₂u₁)/∆  (4)

where

π₁₁ = (γ₂ − γ₁)/∆,  π₁₂ = δ₂/∆,  π₁₃ = ε₂/∆  (5)
π₂₁ = (β₁γ₂ − β₂γ₁)/∆,  π₂₂ = β₁δ₂/∆,  π₂₃ = β₁ε₂/∆  (6)

and ∆ = β₁ − β₂.  (7)
2. Types of Variables: Structural and Reduced Form Equations

The variables p and q which are to be explained by the model are endogenous. Equations (1) and (2) are structural equations. Each of them describes the behavior of one of the building blocks of the model, and (as is typical) contains more than one endogenous variable. Equations (3) and (4) are reduced form equations. Each of them contains just one endogenous variable, and determines its value as a function of parameters, shocks, and explanatory variables (here y and w). The explanatory variables are predetermined if the shocks for any given period are independent of the explanatory variables for that period and all previous periods; they are exogenous if the shocks for each period are independent of the explanatory variables for every period. Thus all exogenous variables are predetermined, but not conversely. For example, the value of an endogenous variable from a previous period cannot be exogenous, but it is predetermined if the shocks are serially independent.
3. The Need for Estimation of Parameters
One job of a reduced form equation is to tell how an endogenous variable responds to a change in any predetermined or exogenous variable. Typically, no single structural equation can do this. Another job of
a reduced form equation is to forecast the value of an endogenous variable in a future period, based on expected future values of the predetermined and exogenous variables and of the shocks (the latter can safely be set at zero in most cases). Typically, no single structural equation can do this either. If the reduced form equations are to do these jobs, numerical values of their parameters are needed (in this example, values of the πs in Eqns. (3) and (4)). Numerical values of the structural parameters are needed as well (in this example, the βs, γs, δ₂ and ε₂ in Eqns. (1) and (2)). There are several reasons. First, one wants to understand each of the system's separate building blocks in its own right. Second, in overidentified cases (see Sect. 7) better estimates of the reduced form can be obtained by solving the estimated structure (as in Eqns. (5)–(7)) than by estimating the reduced form directly. Third, if forecasts made by the reduced form are poor, one wants to know which of the structural equations fit the forecast period's data poorly, and which (if any) fit well. This is discovered by inserting observed values of the variables into each estimated structural equation to obtain an estimate of its forecast-period shock. Then one can revise the poorly-fitting structural equation(s) and so try to improve the model's accuracy (see Sect. 11). Fourth, if one wants to find the new values of the reduced form parameters after a change in a structural parameter, one needs to know the old values of all the structural parameters, as well as the new value of the one that has changed. In the example, one can then use Eqns. (5)–(7) to compute the reduced form parameters. The critique of Robert Lucas (1976) provides a warning about pitfalls in doing this.
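For concreteness, Eqns. (5)–(7) are easy to evaluate numerically. The sketch below uses purely hypothetical structural values (not from the text); as a preview of Sect. 5, it also shows that two different reduced form ratios recover the same β₁, which is what overidentification of the supply equation means:

```python
# Hypothetical structural parameter values for Eqns. (1)-(2).
gamma1, beta1 = 2.0, 0.5                              # supply, beta1 > 0
gamma2, delta2, eps2, beta2 = 10.0, 0.3, 0.1, -0.8    # demand, beta2 < 0

Delta = beta1 - beta2                                 # Eqn. (7)
pi11 = (gamma2 - gamma1) / Delta                      # Eqn. (5)
pi12 = delta2 / Delta
pi13 = eps2 / Delta
pi21 = (beta1 * gamma2 - beta2 * gamma1) / Delta      # Eqn. (6)
pi22 = beta1 * delta2 / Delta
pi23 = beta1 * eps2 / Delta

print(pi22 / pi12, pi23 / pi13)   # both ratios equal beta1 = 0.5
```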
4. Least Squares Estimators

Least squares (LS) estimators (see Linear Hypothesis) of an equation's coefficients are biased if the shocks in each period are not independent of the explanatory variables in all periods. Clearly this is true of Eqns. (1) and (2), since Eqns. (3) and (4) show that both u₁ and u₂ influence both p and q. This is typically true of LS estimators of simultaneous structural equations. However, LS estimators often have small variances, so they may sometimes have acceptably small expected squared errors even when they are biased. LS estimators of the π's in the reduced form, Eqns. (3) and (4), are unbiased if the shocks in each period have zero mean and constant variance, and are independent of y and w for all periods (so that y and w are exogenous). They have minimum variance among unbiased estimators and are consistent if in addition the shocks are uncorrelated across time. They remain consistent if the shocks are uncorrelated across time but y and w are predetermined rather than exogenous. The generalized least squares (GLS) method is minimum variance unbiased if the explanatory
variables are exogenous but the shocks are correlated across time. This method requires information about the variances and covariances of the shocks.
5. The Identifiability of Structural Parameters

Having estimates of the reduced form parameters, one can try to make them yield estimators of the structural parameters. In the example this means trying to solve the estimated version of Eqns. (5)–(7) for estimators of the γ's, β's, δ₂ and ε₂. Denote the LS estimators of the π's by π̂. Equations (5) and (6) show that π̂₂₂/π̂₁₂ is one estimator of β₁, and π̂₂₃/π̂₁₃ is another. Either of them leads to an estimator of γ₁ in Eqn. (1). These are indirect least squares (ILS) estimators. They are not unbiased, but they are consistent if the LS reduced form estimators are consistent. The supply parameters β₁ and γ₁ are identified, meaning that the data and the model reveal their values (subject to sampling variation). Indeed, they are overidentified, because in small samples the two ways to estimate β₁ from the π̂'s yield different results. (If either δ₂ or ε₂ were zero, there would be only one way, and the supply equation would be just identified. If both δ₂ and ε₂ were zero, there would be no way, and the supply equation would be unidentified.) Equations (5)–(7) have an infinite number of solutions for the demand parameters β₂, γ₂, δ₂ and ε₂. Hence the data and the model do not reveal their values; they are unidentified. Another way to see this is to imagine price and quantity data being generated by intersections of the supply and demand curves in the pq plane. Ignoring shocks, the supply curve is fixed, but the demand curve shifts when y or w changes. Hence the intersections reveal the slope and position of the supply curve as the demand curve shifts. But they reveal nothing about the slope of the demand curve. The identifiability of parameters is crucial: they cannot be estimated if they are not identified. Reduced form parameters are usually identified, except in special cases. But the identifiability of structural parameters should be checked before one tries to estimate them. The least squares formula can be applied to an unidentified structural equation such as Eqn. (2), but the result is not an estimator of that equation. For a more detailed and general discussion, see Statistical Identification and Estimability.
6. Simultaneous Equations Estimation Methods

Many methods have been developed for estimating identifiable parameters of simultaneous structural equations without the bias and inconsistency of LS. Most of them exploit the assumed independence between shocks and predetermined variables. Some estimate one equation at a time, and others estimate the whole system at once.
7. Estimating One Structural Equation at a Time

The ILS method mentioned above is one way of estimating one equation that is part of a system. A second way is the instrumental variables (IV) method (Durbin 1954). LS and IV will be shown for Eqn. (1). Denote the deviation of each variable from its sample mean by an asterisk, for example, p* = p − p̄. Now rewrite Eqn. (1) in terms of deviations from sample means, thus eliminating the constant term γ₁; multiply it by p*; sum it over all the sample observations; and divide the sum by Σp*². The result is the LS estimator of β₁:

β̂₁ = Σq*p*/Σp*² = β₁ + Σu₁*p*/Σp*²  (8)
It is biased and inconsistent because the shock u₁ influences p via Eqn. (3), so that the error term, the last term in Eqn. (8), does not equal zero in expectation or in probability limit. Now rewrite Eqn. (1) as before in terms of deviations from sample means, but this time multiply it by y*, sum it over all observations, and divide by Σp*y*. The result is an IV estimator of β₁:

β̂₁y = Σq*y*/Σp*y* = β₁ + Σu₁*y*/Σp*y*  (9)

If y is predetermined it is an instrument. In this case the IV estimator is consistent if u₁ has zero mean and constant variance and is serially independent, because then Σu₁*y* is zero in probability limit and Σp*y* is not. Similarly, if w is predetermined it is another instrument, and another IV estimator of β₁ is β̂₁w = Σq*w*/Σp*w*. Clearly, these IV estimators are equivalent to the ILS estimators of β₁ obtained earlier. For example, the one based on the reduced form coefficients of y is

π̂₂₂/π̂₁₂ = (Σq*y*/Σy*²)/(Σp*y*/Σy*²) = Σq*y*/Σp*y* = β̂₁y  (10)
Similarly, the other one is π̂₂₃/π̂₁₃ = β̂₁w. For a reduced form equation, IV is the same as LS, because the instruments are precisely the equation's predetermined variables. The two-stage least squares (2SLS) method (Theil 1972, Basmann 1957) is another way of estimating one structural equation at a time. For an overidentified equation it is superior to ILS or IV because it results in one estimator rather than two or more; for a just identified equation it reduces to ILS and IV. It first computes the LS estimator of the reduced form equation for each endogenous variable that appears on the right side of the structural equation to be
estimated (this is stage 1); it then replaces the observed data for those endogenous variables by the values calculated from the reduced form, and computes the LS estimator for the resulting equation (this is stage 2). For Eqn. (1), stage 1 is to compute the least squares estimators of the π's in the price equation (3) of the reduced form; the second stage is to compute p̂ = π̂₁₁ + π̂₁₂y + π̂₁₃w, substitute this p̂ for p in (1), and compute the LS estimator Σq*p̂*/Σp̂*², which is the 2SLS estimator of β₁. 2SLS is an IV estimator; its instruments are the observed values of the equation's predetermined variables and the reduced-form-calculated values of the endogenous variables from the equation's right side. 2SLS is a generalized method of moments (GMOM) estimator (Hansen 1982). LS and IV are plain MOM estimators. The k-class estimator (Nagar 1959) includes as special cases LS, 2SLS, and the limited information maximum likelihood estimator (LIML) (Anderson and Rubin 1949, 1950). LIML is similar to 2SLS (the two are the same for infinitely large samples), but it has largely been displaced by 2SLS because it requires iterative computations which 2SLS does not, and because it often has a larger variance and sometimes yields outlandish estimates (Kadane 1971, Theil 1972). For an identified structural equation, ILS, IV, LIML, and 2SLS are consistent if the model's explanatory variables are predetermined and the shocks have zero means and constant variances and covariances and are independent across time. An advantage of these methods is that, unlike LS, if applied to an unidentified structural equation they fail to yield estimates. The Bayesian method of moments (BMOM) method (Zellner 1998) obtains estimators of linear reduced form and structural equations without making assumptions about the likelihood function of the data. It assumes that the posterior expectations of the shocks given the data are uncorrelated with the predetermined variables, and that the differences between actual and estimated shocks have a covariance matrix of a particular form. Zellner shows how to find optimum BMOM estimators for several different loss functions, including a precision loss function which is a weighted sum of squares and cross-products of errors. For this loss function, the optimum BMOM estimator of an identified structural equation belongs to the k-class; it turns out to be LS if the ratio of the sample size to the number of predetermined variables in the model is 2, and it approaches 2SLS in the limit as this ratio grows without limit. When a structural equation is overidentified, restrictions are placed on the reduced form parameters (such as π₂₂/π₁₂ = π₂₃/π₁₃ in the example, because both ratios must equal β₁), but LS estimators of the reduced form ignore this information. Better estimators of the reduced form, because they use this information, can then be obtained by solving the estimated structure (see Sect. 3).
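A minimal sketch of the two-stage calculation just described may help; the function below and its variable names are hypothetical (not from the text), and the commented call shows how the supply equation (1) could be estimated with income and wealth as instruments for price:

```python
import numpy as np

def two_sls(y, X_endog, X_exog, instruments):
    """Two-stage least squares for one structural equation.

    Stage 1: regress each endogenous regressor on all instruments
    (the excluded instruments plus the equation's own exogenous
    regressors) and form fitted values.
    Stage 2: least squares with the fitted values replacing the
    observed endogenous regressors.
    """
    Z = np.column_stack([instruments, X_exog])
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)        # projection onto the instruments
    X = np.column_stack([X_endog, X_exog])
    X_hat = P @ X                                # stage 1 fitted values
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)   # stage 2

# Hypothetical usage for the supply equation q = gamma1 + beta1*p + u1,
# with y (income) and w (wealth) as instruments for the endogenous p:
# coef = two_sls(q, p.reshape(-1, 1),
#                np.ones((len(q), 1)),               # constant term
#                np.column_stack([inc, wealth]))
# coef holds (beta1_hat, gamma1_hat) in the order the columns were stacked.
```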
8. Estimating a Complete System

If in an identified complete simultaneous equations system the shocks are normally distributed and other suitable conditions are satisfied, an asymptotically efficient but computationally complex method of estimating all the parameters of a complete system is the full information maximum likelihood (FIML) method (Koopmans 1950, Durbin 1988). It involves maximizing the joint likelihood function of all the data and parameters, and requires iterative computations. It has largely been displaced by the three-stage least squares (3SLS) method (Zellner and Theil 1962), which is also asymptotically efficient but is much easier to compute. 3SLS is an application of the GLS method. 3SLS gets the required information on the shocks' covariances from 2SLS estimates of the shocks; hence the name 3SLS.
9. Cross-section vs. Time Series Studies

So far the treatment has concerned time series data, where the observations describe the same country, city, family, firm, or what not, at successive periods of time. The same type of analysis is appropriate for cross-section data, where the observations describe different countries, firms, or what not, at a single point of time, with obvious modifications. The analysis has also been extended to panel data, that is, a time series of cross-sections.
10. The Decline of Simultaneous Equations Models

In recent years the econometric literature has paid diminishing attention to simultaneous equations models and their identifiability. Many modern textbooks give these topics little space, and that only near the end of the book (Davidson and MacKinnon 1993). Perhaps one reason is that modern computers can handle the estimation of nonlinear models, for which identification criteria are much more complicated and for which lack of identifiability is a much rarer problem.
11. The Problem of Choosing and Testing a Model

Thus far it has been presumed that the model being estimated is an accurate representation of the real-world process that actually generates the data. This is far from certain; at best it is likely to be only approximately true. Any model has been chosen by someone, perhaps with the aid of economic theory
about the maximization of profit or utility, or perhaps because it was suggested by previously observed data. But there is no guarantee that it contains the right variables, or that their endogeneity or exogeneity is correctly stated, or that the correct number of lagged variables has been chosen, or that its equations have the right mathematical form, or that the assumed distribution of its shocks is correct. One can perform diagnostic tests to see whether the estimated model fits past data well, and whether its calculated past shocks have constant variance and are free of any obvious systematic behavior. But this does not assure that it will do well with future data. In my view, the most stringent and most important test of a model is to expose it to data that were not available when the model was formulated. If it does not describe these data well, it leaves something to be desired. The more new data it can describe well, the more confidence one can have in it. But at any point the best that can be said about a model is that it has done a good job of describing the data that are available thus far.

See also: Instrumental Variables in Statistics and Econometrics; Linear Hypothesis; Simultaneous Equation Estimates (Exact and Approximate), Distribution of; Statistical Identification and Estimability
Bibliography
Anderson T W, Rubin H 1949 Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20: 46–63
Anderson T W, Rubin H 1950 The asymptotic properties of estimates of parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 21: 570–82
Basmann R L 1957 A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25: 77–83
Christ C F (ed.) 1994 Simultaneous Equations Estimation. Edward Elgar, Aldershot, UK
Davidson R, MacKinnon J G 1993 Estimation and Inference in Econometrics. Oxford University Press, New York
Durbin J 1954 Errors in variables. Review of the International Statistical Institute 22: 23–32
Durbin J 1988 Maximum likelihood estimation of the parameters of a system of simultaneous regression equations. Econometric Theory 4: 159–70
Haavelmo T 1943 The statistical implications of a system of simultaneous equations. Econometrica 11: 1–12
Haavelmo T 1944 The probability approach in econometrics. Econometrica 12(suppl.): 1–115
Hansen L P 1982 Large sample properties of generalized method of moments estimators. Econometrica 50: 1029–54
Hausman J A 1983 Specification and estimation of simultaneous equation models. In: Griliches Z, Intriligator M D (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 1, pp. 391–448
Kadane J B 1971 Comparison of k-class estimators when the disturbances are small. Econometrica 39: 723–37
Koopmans T C (ed.) 1950 Statistical Inference in Dynamic Economic Models. Cowles Commission Monograph 10. Wiley, New York
Lucas R E Jr 1976 Econometric policy evaluation: a critique. Carnegie-Rochester Conference Series on Public Policy 1: 19–46
Nagar A L 1959 The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95
Strotz R H, Wold H O A 1960 A triptych on causal chain systems. Econometrica 28: 417–63
Theil H 1972 Principles of Econometrics. Wiley, New York
Zellner A 1998 The finite sample properties of simultaneous equations' estimates and estimators: Bayesian and non-Bayesian approaches. Journal of Econometrics 83: 185–212
Zellner A, Theil H 1962 Three-stage least squares: simultaneous estimation of simultaneous equations. Econometrica 30: 54–78
C. F. Christ
Single-case Experimental Designs in Clinical Settings

1. Differing Research Traditions

Two major research traditions have advanced the behavioral sciences. One is based on the hypothetico-deductive approach to scientific reasoning, where a hypothesis is constructed and tested to see if a phenomenon is an instance of a general principle. There is a presumed a priori understanding of the relationship between the variables of interest. These hypotheses are tested using empirical studies, generally using multiple subjects, and data are collected and analyzed using group experimental designs and inferential statistics. This tradition represents current mainstream research practice for many areas of behavioral science. However, there is another method of conducting research that focuses intensive study on an individual subject and makes use of inductive reasoning. In this approach, one generates hypotheses from a particular instance or the accumulation of instances in order to identify what might ultimately become a general principle. In practice, most research involves elements of both traditions, but the extensive study of individual cases has led to some of the most important discoveries in the behavioral sciences and therefore has a special place in the history of scientific discovery. Single-case or single-subject design (also known as 'N of one') research is often employed when the researcher has limited access to a particular population and can therefore study only one or a few subjects. This method
is also appropriate when one wishes to make an intensive study of a phenomenon in order to examine the conditions that maximize the strength of an effect. In clinical settings one is frequently interested in describing which variables affect a particular individual rather than in trying to infer what might be important from studying groups of subjects and assuming that the average effect for a group is the same as that observed in a particular subject. In the current era of increased accountability in clinical settings, single-case design can be used to demonstrate treatment effects for a particular clinical case (see Hayes et al. 1999 for a discussion of the changing research climate in applied settings).
2. Characteristics of Single-case Design
Single-case designs study intensively the process of change by taking many measures on the same individual subject over a period of time. The degree of control in single-case design experiments can often lead to the identification of important principles of change or to a precise understanding of clinically relevant variables in a specific clinical context. One of the most commonly used approaches in single-case design research is the interrupted time-series. A time-series consists of many repeated measures of the same variable(s) on one subject, while measuring or characterizing those elements of the experimental context that are presumed to explain any observed change in behavior. An interrupted time-series is analyzed by examining characteristics of the data before and after the experimental manipulation to look for evidence that the independent variable alters the dependent variable because of the interruption.
3. Specific Types of Single-case Designs

Single-case designs frequently make use of a graphical representation of the data. Repeated measures on the dependent variable take place over time, so the abscissa (X-axis) on any graph represents some form of time scale. The dependent measure or behavior presumably being altered by the treatment is plotted on the ordinate (Y-axis). There are many variations of single-case designs that are well described (e.g., Franklin et al. 1997, Hersen and Barlow 1976, Kazdin 1982), but the following three approaches provide an overview of the general methodology.
3.1 The A-B-A Design

The classic design that is illustrative of this approach is called the A-B-A design (Hersen and Barlow 1976, pp. 167–97), where the letters refer to one or another
experimental or naturally observed conditions presumed to be associated with the behavior of importance. A hypothetical example is shown in Fig. 1.

[Figure 1: A-B-A design. Counts of disruptive behavior (ordinate) plotted over a series of class periods (abscissa) across three phases: baseline, change in contingent attention, and return to baseline.]

By convention, A refers to a baseline or control condition and B indicates a condition where the behavior is expected to change. The B condition can be controlled by the experimenter or alternatively can result from some naturally occurring change in the environment (though only the former are true experiments). The second A in the A-B-A design indicates a return to the original conditions and is the primary means by which one infers that the B condition was the causal variable associated with any observed change from baseline in the target or dependent variable. Without the second A phase, there are several other plausible explanations for changes observed during the B phase besides the experimental manipulation. These include common threats to internal validity such as maturation or intervening historical events coincidental with initiating B. To illustrate the clinical use of such an approach, consider the data points in the first A phase to indicate repeated observations of a child under baseline conditions. The dependent variable of interest is the number of disruptive behaviors per class period that the child exhibits. A baseline of disruptive behaviors is observed and recorded over several class periods. During the baseline observations, the experimenter hypothesizes that the child's disruptive behavior is under the control of contingent attention by the teacher. That is, when the child is disruptive, the teacher pays attention to the child (even negative attention), which unintentionally reinforces the behavior, but when the child is sitting still and is task oriented, the teacher does not attend to the child. The experimenter then trains the teacher to ignore the disruptive behavior and contingently attend to the child (e.g., praise or give some token that is redeemable for later privileges) when the child is emitting behavior appropriate to the classroom task. After training the teacher to differentially reinforce appropriate
classroom behavior, the teacher is instructed to implement these procedures and the B phase begins. Observations are made and data are gathered and plotted. A decrease in the number of disruptive behaviors is apparent during the B phase. To be certain that the decrease in disruptive behaviors is not due to some extraneous factor, such as the child being disciplined at home or simply maturing, the original conditions are reinstated. That is, the teacher is instructed to discontinue contingently attending to task-appropriate behavior and return to giving corrective attention when the child is disruptive. This second A phase is a return to the original conditions. The plot indicates that the number of disruptions returns to baseline (i.e., returns to its previous high level). This return to baseline in the second A condition is evidence that the change observed from the baseline condition A to the implementation of the intervention during the B phase was under the control of the teacher's change in contingent attention rather than some other factor. If another factor were responsible for the change, then the re-implementation of the original baseline conditions would not be expected to result in a return to the original high levels of disruptive behavior. There are many variations of the A-B-A design. One could compare two or more interventions in an A-B-A-C-A design. Each A indicates some baseline condition, the B indicates one type of intervention, and the C indicates a different intervention. By examining differences in the patterns of responding for B and C, one can make treatment comparisons. The notation of BC together, as in A-B-A-C-A-BC-A, indicates that treatments B and C were combined for one phase of the study.
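As a simple illustration of the phase comparison that underlies the visual analysis of an A-B-A design, the following sketch uses hypothetical counts (not the data of Fig. 1):

```python
# Hypothetical counts of disruptive behavior per class period.
phases = {
    "A (baseline)": [9, 8, 10, 9, 11, 10],
    "B (contingent attention)": [7, 5, 4, 3, 2, 2],
    "A (return to baseline)": [6, 8, 9, 10, 9, 10],
}

for label, counts in phases.items():
    print(label, sum(counts) / len(counts))
# The mean drops during B and recovers in the second A phase, the
# pattern that supports attributing the change to the intervention.
```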
3.2 The Multiple-baseline Design

An A-B-A or A-B-A-C design assumes that the treatment (B or C) can be reversed during the subsequent A period. Sometimes it is impossible or unethical to reinstitute the original baseline conditions (A). In these cases, other designs can be used. One such approach is the multiple-baseline design. In a multiple-baseline design, baseline data are gathered across several environments (or behaviors). A treatment is then introduced in one environment, while data continue to be gathered in the other environments. Subsequently, the treatment is implemented in each of the other environments, one at a time, and changes in the target behavior are observed (Poling and Grossett 1986). In the earlier example, if the child had been exhibiting disruptive behavior in math, spelling, and social studies classes, the same analysis of the problem might be applied.
Figure 2 Example of a multiple-baseline design with the same behavior across different settings
The teachers or the researcher might not be willing to reinstitute the baseline conditions if improvements were noted, because doing so would disrupt the learning of others and would not be in the best interests of the child. Figure 2 shows how a multiple-baseline design might be implemented and the data presented visually. Baseline data are shown for each of the three classroom environments. While the initial level of disruptive behavior might be slightly different in each class, it is still high. The top graph in Fig. 2 shows the baseline number of disruptive behaviors in the math class. At the point where the vertical dotted line appears, the intervention is implemented, and its effects appear to the right of the dotted line. The middle and bottom graphs show the same disruptive behaviors for the spelling and social studies classes. No intervention has yet been implemented, and the disruptive behavior remains high and relatively stable in each of these two classrooms. This suggests that the changes seen in the math class were not due to some other cause outside of the school environment or some general policy change within the school, since the baseline conditions and frequency of disruptive behaviors are unchanged in the spelling and social studies classes.
After four more weeks of baseline observation, the same shift-in-attention treatment is implemented in the spelling class, but still not in the social studies class. The amount of disruptive behavior decreases following the treatment implementation in the spelling class, but not in the social studies class. This second change from baseline is a replication of the effect of treatment shown in the first classroom and is further evidence that the independent variable is the cause of the change. There is no change in the social studies class behavior. This observation provides additional evidence that the independent variable, rather than some extraneous variable, is responsible for the change in the behavior of interest. Finally, the treatment is implemented in the social studies class, with a resulting change paralleling those occurring in the other two classes when they were changed. Rather than having to reverse the salutary effects of a successful treatment, as one would in an A-B-A reversal design, a multiple-baseline design allows for successively extending the effects to new contexts as a means of demonstrating causal control. Multiple-baseline designs are used frequently in clinical settings because they do not require reversal of beneficial effects to demonstrate causality. Though Fig. 2 demonstrates control of the problem behavior by implementing changes sequentially across multiple settings, one could keep the environment constant and study the effects of a hypothesized controlling variable by targeting several individual behaviors. For example, a clinician could use this design to test whether disruptive behavior in an institutionalized patient could be reduced by sequentially reinforcing more constructive alternatives. A treatment team might target aggressive behavior, autistic speech, and odd motor behavior when the patient is in the day room. After establishing a baseline for each behavior, the team could intervene first on the aggressive behavior while keeping the preexisting contingencies in place for the other two behaviors. After the aggressive behavior has changed (or a predetermined time has elapsed), the autistic speech behavior in the same day room could be targeted, and so on for the odd motor behavior. The logic of the application of a multiple-baseline design is the same whether one studies the same behavior across different contexts or different behaviors within the same context. To demonstrate causal control, the behavior should change only when it is the target of a specific change strategy; other behaviors not targeted should remain at their previous baselines. Interpretation can be difficult in multiple-baseline designs because it is not always easy to find behaviors that are functionally independent. Again, consider the disruptive classroom behavior described earlier. Even if one's analysis is correct that teacher attention reinforces the disruptive behavior, it may be difficult to show change in only the targeted behavior in a specific classroom.
Figure 3 Example from an Alternating Treatment Design (ATD)
The child may learn that alternative behaviors can be reinforced in the other classrooms and may alter his or her behavior in ways that alter the teachers' reactions, even though the teachers in the other baseline classes did not intend to change their behavior. Nevertheless, this design addresses many ethical and practical concerns about A-B-A (reversal) designs.
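As a rough illustration of the staggered logic, the following sketch computes baseline and treatment means for each setting. The weekly counts and intervention start weeks are invented for illustration (they are not the data behind the text's Fig. 2); the point is that each series shifts only after its own intervention begins.

```python
# Hypothetical multiple-baseline data across three settings.
# Weekly counts of disruptive behavior; the staggered intervention
# start weeks (5, 9, 13) are invented for illustration.
import statistics

starts = {"math": 5, "spelling": 9, "social studies": 13}
data = {
    "math":           [9, 9, 8, 10, 4, 3, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2],
    "spelling":       [8, 9, 9, 9, 9, 8, 9, 9, 3, 2, 2, 2, 2, 2, 2, 2],
    "social studies": [10, 9, 9, 10, 9, 9, 10, 9, 9, 10, 9, 9, 4, 3, 2, 2],
}

for setting, series in data.items():
    k = starts[setting] - 1  # weeks are 1-indexed
    pre, post = series[:k], series[k:]
    print(f"{setting:15s} baseline mean {statistics.mean(pre):.1f} "
          f"-> treatment mean {statistics.mean(post):.1f}")

# Causal control is suggested because each series changes only
# after its own (staggered) intervention point.
```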
3.3 Alternating Treatment Design

The previous designs could be referred to as within-series designs because there is a series of observations within which the conditions are constant, then an intervention with another series of observations during which the new conditions are constant for the duration of the series, and so on. The interpretation of results is based on what appears to be happening within a series of points compared with what is happening within another series. Another variation of the single-case design that is becoming increasingly recognized for its versatility is the alternating treatment design (ATD) (Barlow and Hayes 1979). The ATD can be thought of as a between-series design. It is a method of answering a number of important questions, such as assessing the impact of two or more different treatment variations in a psychotherapy session. In an ATD, the experimenter rapidly and frequently alternates between two (or more) treatments on a random basis during a series and measures the response to the treatments on the dependent measure. This is referred to as a 'between-series' design because the data are arranged by treatment first and order second, rather than by order first, where treatments vary only when series change. There may be no sustained series of one particular type of treatment, although there may be several of the same treatment in a row. To illustrate how an ATD might be applied, consider a situation where a therapist is trying to determine which of two ways of relating to a client produces more useful self-disclosure. One way of relating is to ask open-ended questions and the other is to ask direct questions. The dependent variable is a rating of clinically useful responsiveness from the client. The therapist would follow a random order in asking one of the two types of questions. The name of the design does not mean that the questions alternate back and forth between two conditions on each subsequent therapist turn; rather, they change randomly between treatments. One might observe the data in Fig. 3. The X-axis is the familiar time variable and the Y-axis the rating of the usefulness of the client response. The treatment conditions are shown as separate lines. Note that points for each condition are plotted according to the order in which they were observed, but that the data from each condition are joined as if they were collected in one continuous series (even though the therapist asked the questions in the order O-C-O-O-O-C-C-O …). The data in Fig. 3 indicate that both methods of asking questions produce a trend for the usefulness of the information to increase as the session goes on. However, the open-ended format produces an overall higher degree of utility.
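A minimal sketch of the between-series bookkeeping follows; the random order (extending the O-C-O-O-O-C-C-O sequence above hypothetically) and the usefulness ratings are invented. Observations are grouped by treatment first and order second, as the design requires.

```python
# Hypothetical alternating-treatment data: usefulness ratings of
# client responses after open-ended (O) vs. direct (C) questions,
# in an invented random order.
order   = ["O", "C", "O", "O", "O", "C", "C", "O", "C", "O", "C", "C"]
ratings = [4,   3,   5,   5,   6,   3,   4,   7,   4,   7,   5,   4]

by_condition = {"O": [], "C": []}
for cond, r in zip(order, ratings):
    by_condition[cond].append(r)  # grouped by treatment first, order second

for cond, values in by_condition.items():
    print(cond, values, "mean =", round(sum(values) / len(values), 2))

# Plotting each condition's points as its own joined series (as in
# the text's Fig. 3) makes the between-series comparison visual.
```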
4. Analyses
There are four characteristics of a particular series that can be compared with other series. The first is the level of the dependent measure in a series. One way in which multiple series may differ is with respect to the stable level of a variable. The second way in which multiple series may differ is in the trend each series shows. The third characteristic that can be observed is the shape of responses over time. This is also referred to as the course of the series and can include changes in cyclicity. The last characteristic that could vary is the variability of responses over time. Figure 4 shows examples of how these characteristics might show treatment effects. One might observe changes in more than one of these characteristics within and across series.
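Three of these four characteristics are easy to quantify; the minimal sketch below (with hypothetical phase data) computes the level as a mean, the trend as a least-squares slope, and the variability as a standard deviation. Shape, the course of the series including cyclicity, is usually judged graphically.

```python
# Sketch of three of the four series characteristics named in the
# text: level (mean), trend (least-squares slope per observation),
# and variability (standard deviation). Data are hypothetical.
import statistics

def describe(series):
    n = len(series)
    xs = range(n)
    x_bar, y_bar = (n - 1) / 2, statistics.mean(series)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar, slope, statistics.stdev(series)

baseline  = [9, 8, 10, 9, 9, 8]
treatment = [7, 5, 4, 3, 2, 2]
for name, s in (("A", baseline), ("B", treatment)):
    level, trend, sd = describe(s)
    print(f"{name}: level={level:.2f}  trend={trend:+.2f}/obs  sd={sd:.2f}")
```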
4.1 Visual Interpretation of Results

In single-case design research, there is a long tradition of interpreting the data visually once they have been graphed. In part, this preference emerged because those doing carefully controlled single-case research were looking for large effects that should be readily observable without the use of statistical approaches. Figure 4 depicts examples of clear effects. They demonstrate the optimal conditions for inferring a treatment effect: a stable baseline and clearly different responses during the post-baseline phase. Suggestions for how to interpret graphical data have been described by Parsonson and Baer (1978). However, there are many circumstances where patterns of results are not nearly so clear and errors in interpreting results are more common.
Figure 4 Examples of changes observed in single-case designs
When there are outlier data points (points that fall unusually far from the central tendency of the other points in the phase), it becomes more difficult to determine reliably by visual inspection alone whether an effect exists. If there are carry-over effects from one phase to another that the experimenter did not initially anticipate, results can also be difficult to interpret. Carry-over is observed when the effects of one phase do not immediately disappear when the experiment moves into another phase. For example, the actions of a drug may continue long after a subject stops taking the drug, either because the body metabolizes the drug slowly or because some irreversible learning changes occurred as a result of the drug effects. It is also difficult to interpret graphical results when there is a naturally occurring cyclicity to the data. This might occur when circumstances not taken into account by the experimenter lead to systematic changes in behaviors that might be interpreted as treatment effects. A comprehensive discussion of issues in the visual analysis of graphical displays is provided by Franklin et al. (1996).

4.2 Traditional Measurement Issues

Any particular observation in a series is influenced by three sources of variability. The experimenter intends to identify systematic variability due to the intervention. However, two other sources of variability in an observation must also be considered. These are variability attributable to measurement error and variability due to extraneous sources, such as how well the subject is feeling at the time of a particular observation. Measurement error refers to how well the score on the measurement procedure used as the dependent measure represents the construct of interest. A detailed discussion of the reliability of measurement entails some complexities that can be seen intuitively if one compares how accurately one could measure two different dependent measures.
If the experimenter were measuring the occurrence of head-banging behavior in an institutionalized subject, frequency counts would be relatively easy to record accurately once there was some agreement among raters on what constituted an instance of the target behavior. There would still be some error, but it would be less than if one were trying to measure a dependent measure such as enthusiasm in a classroom. The construct of enthusiasm is less easily characterized, has more error associated with its measurement, and is therefore likely to show higher variability. As single-case designs are applied to new domains of dependent measures, such as psychotherapy research, researchers are increasingly attentive to traditional psychometric issues, including internal and external validity as well as reliability. Traditionally, when reliability was considered in early applications of single-case designs, the usual method was to report interrater reliability as the percent agreement between two raters rating the dependent measure (Page and Iwata 1986). Often these measures were dichotomously scored counts of behaviors (e.g., a behavior occurred or did not occur). Simple agreement rates can produce inflated estimates of reliability. More sophisticated measures of reliability are being applied to single-case design research, including kappa (Cohen 1960) and variations of intraclass correlation coefficients (Shrout and Fleiss 1979), to name a few.
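The contrast between raw agreement and chance-corrected agreement can be shown in a few lines. The sketch below uses invented dichotomous ratings and implements Cohen's kappa directly from its definition: observed agreement corrected for the agreement expected from the raters' marginal rates.

```python
# Sketch contrasting simple percent agreement with Cohen's (1960)
# kappa for two raters scoring occurrence (1) / nonoccurrence (0)
# of a target behavior. Ratings are invented for illustration.
rater1 = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
rater2 = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
n = len(rater1)

p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n

# Chance agreement from the marginal proportions of each category.
p_chance = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in (0, 1))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"agreement = {p_observed:.2f}, kappa = {kappa:.2f}")

# Kappa is lower than raw agreement because it discounts the
# agreement two raters would reach by chance alone.
```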
4.3 Statistical Interpretation of Results

While the traditional method of analyzing the results of single-case design research has been visual inspection of a graphical representation of the data, there is a growing awareness of the utility of statistical verification of assertions of a treatment effect (Kruse and Gottman 1982). Research has shown that relying solely on visual inspection of results (particularly when using response-guided experimentation) may result in high Type I error rates (incorrectly concluding that a treatment effect exists when it does not). However, exactly how to conduct statistical analyses of time-series data appropriately is controversial. If the data points in a particular series were individual observations from separate individuals, then traditional parametric statistics, including the analysis of variance, t-tests, or multiple regression, could all be used provided the appropriate statistical assumptions were met (e.g., normally distributed scores with equal variances). All of these statistical methods are variations of the general linear model. The most significant assumption in the general linear model is that residual errors are normally distributed and independent. A residual score is the difference between a particular score and the score predicted from a regression equation for that particular value of the predictor variable. The simple correlation between a variable and its residual has an expected value of 0. In single-case designs, it can be the case that any particular score for an individual is related to (correlated with) the immediately preceding observation. If scores are correlated with their immediately preceding score (or following score), then they are said to be serially dependent or autocorrelated. Autocorrelation violates a fundamental principle of traditional parametric statistics. An autocorrelation coefficient is a slightly modified version of the Pearson correlation coefficient (r). Normally r is calculated using pairs of observations (X_i, Y_i) for several subjects. The autocorrelation coefficient (ρ_k) is the correlation between a score or observation and the next observation (Y_i, Y_{i+1}). When this correlation is calculated between an observation and the next observation, it is called a lag 1 autocorrelation. The range of the autocorrelation coefficient is from −1.0 to +1.0. If the autocorrelation is 0, it indicates that there is no serial dependence in the data and the application of traditional statistical procedures can reasonably proceed. As ρ_k increases, there is increasing serial dependence. As ρ_k becomes negative, it indicates a rapidly cycling pattern. A lag can be extended past 1, and interesting patterns of cyclicity may emerge. In economics, a large value of ρ_k at lag 12 can indicate an annual pattern in the data. In physiological psychology, lags of 12 or 24 may indicate a systematic diurnal or circadian rhythm in the data. Statistical problems emerge when there is autocorrelation among the residual scores. As the value of ρ_k for the residuals increases (i.e., there is a positive autocorrelation), the computed values of many traditional statistics are inflated, and Type I errors occur at a higher than expected rate. Conversely, when residuals show a negative autocorrelation, the resulting statistics will be too small, producing misleading significance testing in the opposite direction.
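A lag 1 autocorrelation estimate needs only a few lines. This sketch, run on a hypothetical score series, pairs each observation with the next one, as described above.

```python
# A minimal lag-1 autocorrelation estimate on a hypothetical
# session-by-session series y: each observation is paired with
# the next one, and the covariance is scaled by the total
# sum of squares around the series mean.
def lag1_autocorrelation(y):
    n = len(y)
    mean = sum(y) / n
    num = sum((y[i] - mean) * (y[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in y)
    return num / den

y = [9, 8, 10, 9, 9, 8, 7, 5, 4, 3, 2, 2]
print(f"lag-1 autocorrelation = {lag1_autocorrelation(y):.2f}")

# Values near 0 suggest no serial dependence; positive values
# signal the dependence that inflates traditional test statistics.
```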
Suggestions for how to avoid problems of autocorrelation include the use of distribution-free statistics such as randomization or permutation tests. However, if the residuals are autocorrelated, even these approaches may not give fully accurate probability estimates (Gorman and Allison 1996, p. 172). There are many proposals for addressing the autocorrelation problem. For identifying patterns of cyclicity, even where cycles are embedded within other cycles, researchers have made use of spectral analysis techniques, including Fourier analysis. These are useful where there may be daily, monthly, and seasonal patterns in the data, such as one might see in changes in mood over long periods of observation. The most sophisticated of these approaches is the autoregressive integrated moving average, or ARIMA, method (Box and Jenkins 1976). This is a regression approach that tries to model the value of a particular observation as a function of parameters estimating the effects of past observations (φ), moving averages of the influence of residuals on the series (θ), and the removal of systematic trends throughout the entire time series. The computation of ARIMA models is a common feature of most statistical computer programs. The method's primary limitation is that it requires many data points, often more than are practical in applied or clinical research settings. Additionally, several ARIMA models may fit any particular data set, making results difficult to interpret. Alternative simplifications of regression-based analyses are being studied to help solve the interpretive problems of using visual inspection alone as a means of identifying treatment effects (Gorman and Allison 1996).
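As one concrete instance of the distribution-free option mentioned above, the following sketch runs a simple randomization test of the A-versus-B mean difference on hypothetical phase data (the data and number of resamples are invented). As the text cautions, even this p-value can be distorted if the residuals are autocorrelated.

```python
# Sketch of a randomization (permutation) test of the A-vs.-B
# mean difference. Phase data are hypothetical.
import random

a_phase = [9, 8, 10, 9, 9, 8]
b_phase = [7, 5, 4, 3, 2, 2]
observed = sum(a_phase) / len(a_phase) - sum(b_phase) / len(b_phase)

pooled = a_phase + b_phase
n_a = len(a_phase)
count = 0
n_resamples = 10000
random.seed(1)
for _ in range(n_resamples):
    random.shuffle(pooled)  # randomly reassign observations to phases
    diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
    if diff >= observed:
        count += 1

print(f"one-sided p ~= {count / n_resamples:.4f}")
```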
4.4 Generalization of Findings and Meta-analysis

Although single-case designs can allow for convincing demonstrations of experimental control in a particular case, it is difficult to know how to generalize from single-case designs to other contexts. Single-case design researchers are often interested in specific effects, in a specific context, for a specific subject. Group designs provide information about average effects, across a defined set of conditions, for an average subject. Inferential problems exist for both approaches. For the single-case design approach, the question is whether results apply to any other set of circumstances. For the group design approach, the question is whether the average result observed for a group applies to a particular individual in the future, or even whether anyone in the group itself showed the average effect. In the last two decades of the twentieth century, meta-analysis came into use as a means of aggregating results across large numbers of studies to summarize the current state of knowledge about a particular topic. Most of the statistical work has focused on aggregating across group design studies.
More recently, meta-analysts have started to address how to aggregate the results from single-case design studies in order to increase the generalizability of this type of research. There are still significant problems in finding adequate ways to reflect changes in slopes (trends) between phases as an effect size statistic, as well as ways to correct for the effects of autocorrelation in meta-analysis, just as there are for the primary studies upon which a meta-analysis is based. There is also discussion of how, and whether, single-case and group design results could be combined (Panel on Statistical Issues and Opportunities for Research in the Combination of Information 1992).
5. Summary

Technical issues aside, the difference between the goal of precision and control typified in single-case design and the interest in estimates of overall effect sizes characteristic of those who favor group designs makes for a lively discussion about the importance of aggregating across studies for the purpose of generalization. With analytic methods for single-case design research becoming more sophisticated, use of this research strategy to advance science is likely to expand as researchers search for large effects under well-specified conditions.

See also: Behavior Analysis, Applied; Case Study: Logic; Case Study: Methods and Analysis; Case-oriented Research; Experimental Design: Overview; Experimental Design: Randomization and Social Experiments; Experimenter and Subject Artifacts: Methodology; Hypothesis Testing: Methodology and Limitations; Laboratory Experiment: Methodology; Medical Experiments: Ethical Aspects; Psychotherapy: Case Study; Quasi-Experimental Designs; Reinforcement, Principle of; Single-subject Designs: Methodology
Bibliography

Barlow D H, Hayes S C 1979 Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis 12: 199–210
Box G E P, Jenkins G M 1976 Time Series Analysis, Forecasting, and Control. Holden-Day, San Francisco
Cohen J A 1960 A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20: 37–46
Franklin R D, Allison D B, Gorman B S 1997 Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ
Franklin R D, Gorman B S, Beasley T M, Allison D B 1996 Graphical display and visual analysis. In: Franklin R D, Allison D B, Gorman B S (eds.) Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 119–58
Gorman B S, Allison D B 1996 Statistical alternatives for single-case designs. In: Franklin R D, Allison D B, Gorman B S (eds.) Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 159–214
Hayes S C, Barlow D H, Nelson-Gray R O 1999 The Scientist-Practitioner: Research and Accountability in the Age of Managed Care, 2nd edn. Allyn and Bacon, Boston
Hersen M, Barlow D H 1976 Single Case Experimental Designs: Strategies for Studying Behavior Change. Pergamon Press, New York
Kazdin A E 1982 Single Case Research Designs: Methods for Clinical and Applied Settings. Oxford University Press, New York
Kruse J A, Gottman J M 1982 Time series methodology in the study of sexual hormonal and behavioral cycles. Archives of Sexual Behavior 11: 405–15
Page T J, Iwata B A 1986 Interobserver agreement: History, theory, and current methods. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis: Issues and Advances. Plenum, New York, pp. 99–126
Panel on Statistical Issues and Opportunities for Research in the Combination of Information 1992 Contemporary Statistics: Statistical Issues and Opportunities for Research. National Academy Press, Washington, DC, Vol. 1
Parsonson B S, Baer D M 1978 The analysis and presentation of graphic data. In: Kratochwill T R (ed.) Single-subject Research: Strategies for Evaluating Change. Academic Press, New York, pp. 101–65
Poling A, Grossett D 1986 Basic research designs in applied behavior analysis. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis: Issues and Advances. Plenum, New York
Shrout P E, Fleiss J L 1979 Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86: 420–8
W. C. Follette
Single-subject Designs: Methodology

Science is about knowledge and understanding and is fundamentally a human enterprise. Adequate methodology is at the heart of scientific activity, for it allows scientists to answer fundamental epistemic questions about what is known or knowable about their subject matter. This includes the validity of data-based conclusions, the generality of findings, and how knowledge may be used to achieve practical goals. Choice of methodology is critical in answering such questions with precision and scope, and the adequacy of a given methodology is conditional upon the purposes to which it is put. The purpose of this chapter is to provide a succinct description of a methodological approach well suited for work with single organisms, whether in basic research or applied contexts, and particularly where large-N designs or inferential statistics are nonsensical or even inappropriate. This approach, referred to here as 'single-subject design' methodology, has a long history in the medical,
behavioral, and applied sciences and includes several design options, from the simple to the highly sophisticated and complex. As will be seen, all such designs involve a rigorous and intensive experimental analysis of a single case, and often replications across several individual cases. No attempt will be made here to describe all available design options, including available statistical tests, as excellent resources are available on both subjects (see Barlow and Hersen 1984, Busk and Marascuilo 1992, Hayes et al. 1999, Huitema 1986, Iversen 1991, Kazdin 1982, Parsonson and Baer 1992). Rather, the intent is to provide (a) an overview of single-subject design methodology and the rationale guiding its use in experimental and applied contexts; (b) a succinct description of the main varieties of single-subject designs, including basic guidelines for their application; and (c) a review of recent trends in the use of such designs in applied contexts as a means to increase accountability.
1. Single-subject Design: Overview and Rationale

At the core, all scientists ultimately deal with particulars, whether cells, atoms, microbes, or the behavior of organisms. It is from such particulars that the facts of science and the generality of findings are derived (Sidman 1960). This is true whether one aggregates a limited set of observations from a large sample of cases into groups that differ on one or more variables, or examines situations involving a few cases and a large number of observations of each over time (see also Hilliard 1993). The former is characteristic of hypothetico-deductive group design strategies that rely on inferential statistics to support scientific claims, whereas the latter idiographic and inductive approach is characteristic of single-subject methodology. As with other research designs, single-subject methodology is as much an approach to science as it is a set of methodological rules and procedures for answering scientific and practical questions. As an approach, single-subject designs have several distinguishing features.
1.1 Subject Matter of Single-subject Designs

Perhaps the most notable feature of single-subject methodology is its subject matter and sample characteristics; namely, the intensive study of the behavior of single (i.e., N = 1) organisms. This feature is also a source of great misunderstanding (Furedy 1999), particularly with regard to the generality and scientific utility of research based on an N of 1. It should be stressed that single-subject methodology gets its name not from sample size per se, but rather because the unit of analysis is the behavior of individuals (Perone 1991), or what Gottman (1973) referred to as 'N-of-one-at-a-time' designs. Skinner (1966) described it this way: '… instead of studying 1000 rats for one hour each, or
100 rats for 10 hours each, the investigator is likely to study one rat for 1000 hours' (p. 21). This approach served as the foundation for operant learning (Skinner 1953, 1957) and, with the increasing popularity of behavior therapy, set the stage for a new approach to treatment development, treatment evaluation, and increased accountability (Barlow and Hersen 1984). Though single-subject research often involves more than one subject, the data are evaluated primarily on a case-by-case basis. The idiographic nature of this methodology is well suited for applied work, where practitioners often attempt to influence and change problematic behavior(s) of individual clients (Hayes et al. 1999). Though not a requirement, the inclusion of more than one subject helps establish the robustness of observed effects across differentially imposed conditions, individuals, behaviors, settings, and/or time. Hence, direct or systematic replication serves to increase confidence in the observed effects and helps establish the reliability and generality of observed relations (Sidman 1960).
1.2 Analytic Aims of Single-subject Designs: Prediction and Influence (Control)

As with other experimental design strategies, the fundamental premise driving the use of single-subject methodology is to demonstrate control (i.e., influence) via the manipulation of independent variables and analysis of the effects of such variables on behavior (the dependent variables). This emphasis on systematic manipulation of independent variables distinguishes single-subject designs from other small-N research, such as case reports, case studies, and the like, which are often lucidly descriptive but less controlled and systematic. With few exceptions (see Perone 1991), the general convention of single-subject design is to change one variable at a time and to evaluate such effects on behavior. This 'one-variable-at-a-time' rule positions the basic or applied researcher to make clear causal statements about the effects of independent variables compared with a previous state (e.g., naturally occurring behavior during baseline), to compare different independent variables across or within conditions, and only then to evaluate interaction effects produced by two or more independent variables together. Indeed, single-subject methodology is unique in that it typically involves an intensive and rigorous experimental analysis of behavior (i.e., thinking, emotional responses, overt motor acts) across time, and an equally rigorous attempt to isolate and control extraneous sources of variability (e.g., maturation, history) that may confound and/or mask any effects caused by systematic changes in the independent variable(s) across conditions and time (Sidman 1960). This point regarding the handling of extraneous sources of variability cannot be overstated. Iversen
(1991) and others (Barlow and Hersen 1984, Sidman 1960) have noted issues concerning how variability is handled in group design research and in single-subject methodology. With regard to such issues, Iversen put it this way: [In group design research] the fit between theory and observation often is evaluated by means of statistical tests, and variability in data is considered a nuisance rather than an inspiration …. By averaging over N subjects the variability among the subjects is ignored N times. In other words, a data analysis restricted to only the mean generates N degrees of ignorance ... Single-subject designs are used because the data based on the individual subject are more accurate predictors of that individual's behavior than are data based on an averaging for a group of subjects (p. 194).
Though the issue of when and how to aggregate data (if at all) is controversial in single-subject research (Hilliard 1993), there is a consensus that variability is to be pinpointed and controlled to the extent possible. This is consistent with the view that variability is imposed and determined from without, and is not an intrinsic property of the subject matter under study (Sidman 1960). Efforts are deliberately made to control such noise so that prediction and control can be achieved with precision and scope. Consequently, those employing experimental single-subject design methods are more likely to commit Type II errors (i.e., to deny a difference and claim that a variable is not exerting control over behavior when it is) than the more classic Type I errors (i.e., to claim that a variable is exerting control over behavior when it is not; see Baer 1977).
1.3 Single-subject Design: Measurement Issues

Single-subject design methodology can also be distinguished by the frequency with which behavior is sampled within conditions, and across differentially imposed conditions and time. It is common, for instance, for basic and applied researchers to include hundreds of samples of behavior across time (i.e., hours, days, weeks, and sessions) with individual subjects. Yet a minimum of three data points is required to establish level, trend, and variability within a given phase or design element (Barlow and Hersen 1984, Hayes et al. 1999). Changes in level, trend, and at times variability from one condition to the next, and particularly changes that are robust, clear, immediate, and stable relative to behavior observed in a prior condition, help support causal inferences. Here, stability is a relative term and should not be confused with something that is static or flat. Rather, stability provides a background against which to evaluate the reliability of changes produced by an independent variable across conditions and time. By convention, each condition is often continued until some semblance of stability is observed, at which point another condition is imposed, and behavior is evaluated relative to the previous condition or steady state (cf. Perone 1991). Such changes, in turn, are often evaluated visually and graphically relative to a prior, less controlled state (i.e., baseline), or to adjacent conditions involving another independent variable. In single-subject methodology, such stability can be evaluated within a condition (i.e., from one sample or observation to the next) and across conditions (i.e., changes in level, trend, and variability). The pragmatic appeal of such an approach, particularly in applied contexts, rests in allowing the practitioner to evaluate the effectiveness of imposed interventions in real time, and consequently to modify intervention tactics as the data dictate. Unlike more formal group designs that are followed strictly once set in place, single-subject methodology is more flexible: it is common for design elements to be added, dropped, and/or modified as the data dictate so as to meet scientific and applied goals. This feature has obvious parallels with how practitioners work with their clients in designing treatment interventions (see also Hayes et al. 1999). In sum, all single-subject designs focus on individual organisms and aim to predict and influence behavior via: (a) repeated sampling and observation; (b) manipulation of one or more independent variables while isolating and controlling sources of extraneous variability to the extent possible; and (c) demonstration of stability within and across levels of imposed independent variables (see Perone 1991). Discussion will now turn to an enumeration of such features in the context of the more popular single-subject designs.
Most single-subject designs can be generally classified as representing two main types: within-series and between-series designs. Though each will be discussed briefly in turn, it is important to recognize that neither class of design precludes combining elements from the other (e.g., combined-series elements). In other words, a within-series design may, for some purposes, lead the investigator to add between-series elements and viceversa, including other more complex elements (e.g., interaction effects, changing criterion elements). 2.1 Within-series Designs The basic logic and structure of within-series designs are simple; namely, to evaluate changes within a series of data points across time on a single measure or set of related measures (see Hayes et al. 1999). Stability is judged for each data point relative to other data points that immediately precede and follow it. By convention, such designs include comparisons between a naturally occurring state of behavior (denoted by the letter A) and the effects of an imposed manipulation of an independent variable or intervention (denoted by different letters such as B, C, and so on). A simple AB design involves sampling naturally occurring beha-
behavior (A phase), followed by repeated assessment of responding when the independent variable is introduced (B phase). For example, suppose a researcher wanted to determine the effects of rational-emotive therapy on initiating social interactions for a particular client. If an A-B design were chosen, the rate of initiations prior to treatment would be compared with that following treatment, in a manner analogous to the popular pre-to-post comparisons of group outcome research. It should be noted, however, that A-B designs are inherently weak in controlling for threats to internal validity (e.g., testing, maturation, history, regression to the mean). Withdrawal/reversal designs control for such threats, and hence provide a more convincing case of experimental control. Withdrawal designs represent a replication of the basic A-B sequence a second time, with the term withdrawal representing the imposed removal of the active treatment or independent variable (i.e., a return to a second baseline or A phase). For example, a simple A-B-A sequence allows one to evaluate treatment effects relative to baseline responding. If an effect due to treatment is present, then it should diminish once treatment is withdrawn and the subject or client is returned to an A phase. Other variations on withdrawal designs include B-A-B designs and A-B-A-B (reversal) designs (see Barlow and Hersen 1984, Hayes et al. 1999). A-B designs and withdrawal/reversal designs are typically used to compare the effects of a finite set of treatment variables with baseline response levels. Data regarding stable response trends are collected across several discrete periods (e.g., time, sessions), wherein the independent variable is either absent (baseline) or present (treatment phase). Phase shifts are data-driven, and response stability determines the next element added to the design, elements that may include other manipulations or treatments either alone (e.g., A-B-A-C-A-B-A) or in combination (e.g., A-B-A-B+C-A-B-A). As the manipulated behavior change is repeatedly demonstrated and replicated across increasing numbers of phase shifts and time, confidence in the role of the independent variable as a cause of such changes increases. This basic logic of within-series single-subject methodology has been expanded in sophisticated and at times complex ways to meet basic and applied purposes. For instance, such designs can be used to test the differential effects of more than one treatment. Other reversal designs (i.e., B-C-B-C) involve the comparison of different, yet consecutive, treatment interventions across multiple phase shifts. These designs are similar to the above reversals in how control is evaluated, but differ primarily in that baseline phases are not required. Designs are also available that combine features of withdrawal and reversal. For example, A-B-A-B-C-B designs allow one to compare A and B phases with each other (reversal; A-B-A-B) and a second treatment with B (reversal; e.g., B-C-B), or even to evaluate the extent to which behavior tracks a
specified behavioral criterion (i.e., changing criterion designs; see Hayes et al. 1999). Changing criterion designs provide an alternative method for experimentally analyzing behavioral effects without subsequent treatment withdrawal. A criterion is set at a level that can be met given exposure to the treatment, and is then systematically increased (or decreased) to place greater demand on acquiring new repertoires. For example, a child learning to add may be required to calculate 4 of 10 problems correctly to earn a prize. Once the child successfully and consistently demonstrates this level of responding, the demand increases to 6 of 10 correct problems. Changing criterion designs serve as a medium for demonstrating learning through the achievement of successive approximations of the end-state.
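The stepping rule can be made explicit in a few lines. The sketch below is hypothetical throughout: the session scores, the step size of two problems, and the three-session consistency rule are all invented; only the starting criterion of 4 of 10 comes from the example above. The criterion is raised only after the child meets the current demand consistently.

```python
# Sketch of the changing-criterion logic: the performance demand
# steps up only after the current criterion is met consistently.
# Session scores, step size, and streak rule are hypothetical.
sessions = [4, 5, 4, 6, 6, 7, 6, 8, 8, 9]   # problems correct out of 10
criterion, streak_needed, streak = 4, 3, 0

for i, correct in enumerate(sessions, start=1):
    met = correct >= criterion
    streak = streak + 1 if met else 0
    print(f"session {i}: {correct}/10, criterion {criterion} "
          f"{'met' if met else 'missed'}")
    if streak == streak_needed:     # consistent success: raise the demand
        criterion, streak = criterion + 2, 0
```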
2.2 Between-series Designs

Between-series designs differ from within-series designs primarily in that data are first grouped and compared by condition, and then by time, whereas the reverse is true for within-series designs. Between-series designs need not contain phases, as evaluation of level, trend, and stability is organized by condition first, not by time alone (Hayes et al. 1999, p. 176). Designs within this category include the 'alternating-treatment design' and the 'simultaneous-treatment design.' The basic logic of both is the same, in that a minimum of two treatments are evaluated concurrently in an individual case. With the alternating-treatment design, the concurrent evaluation is made between rapid and largely random alternations of two or more conditions (Barlow and Hayes 1979). Unlike within-series designs, alternating-treatment designs contain no phases (Hayes et al. 1999). The same is true for simultaneous-treatment designs, which are appropriate for situations where one wishes to evaluate the concurrent or simultaneous application of two or more treatments in a single case. Rapid or random alternation of treatments is not required with the simultaneous-treatment design. What is necessary is that two or more conditions are simultaneously available, with the subject choosing between them. This particular design is not conducive to evaluating treatment outcome, but is appropriate for evaluating preference or choice (see Hayes et al. 1999). Alternating-treatment designs, by contrast, fit well with applied work, as therapists must often routinely target multiple problems concurrently, and thus need to switch rapidly between intervention tactics in the context of therapy.
2.3 Combining Within- and Between-series Elements: Multiple-baseline Design

Multiple-baseline designs build upon and integrate the basic logic and structure of within- and
between-series elements. The fundamental premise of multiple-baseline designs is to replicate phase change effects systematically in more than one series, with each subsequent uninterrupted series serving as a control condition for the preceding interrupted series. Series can be compared and arranged across behaviors, across settings, across individuals, or some combination of these (see Hayes et al. 1999). Such designs require that the series under consideration be independent (e.g., two functionally distinct behaviors), and that intervention be administered sequentially, beginning with the first series, while the others are left uninterrupted as controls. For example, a client might present with three distinct behavior problems, all requiring exposure therapy. All behaviors would be monitored during baseline (A phase), and after some semblance of stability is reached, the treatment (B phase) would be applied to the first behavior series, while the remaining two behaviors are continuously monitored in an extended baseline. Once changes in the first behavior resulting from treatment reach stability, the second series would be interrupted and treatment applied, while the first behavior continues in the B phase and the third behavior is monitored in an extended baseline. The procedure followed for the first two series elements is then repeated for the third behavior. This logic can be similarly applied to multiple-baseline designs across individuals or settings (see Barlow and Hersen 1984, Hall et al. 1970). More complex within-series elements (e.g., A-B-A-B, or B-C-B, or counterbalanced phases such as B-A-B and A-B-A) can be evaluated across behaviors, settings, and individuals. Note that with multiple-baseline designs the treatment is never withdrawn; rather, it is introduced systematically across a set of dependent variables. Independent variable effects are indicated by how reliably behavior change correlates with the onset of treatment for a particular dependent variable. For such reasons, multiple-baseline designs are quite popular, owing much to their ease of use, their strength in ruling out threats to internal validity, their built-in replication, and their fit with the demands of applied practitioners working in therapy, schools, and institutions. Indeed, such designs can be useful when therapists are working with several clients presenting with similar problems, when clients begin therapy at different times, or in cases where targets for change in therapy occur sequentially.
3. Recent Trends in the Application of Single-subject Methods

Single-subject methodology has a long historical affiliation with basic and applied behavioral research, and with the behavior therapy movement more generally (Barlow and Hersen 1984). Though the popularity of
single-subject methods in published research appearing in mainstream behavior therapy journals appears to be declining relative to the use of group design methodology (Forsyth et al. 1999), there does appear to be a resurgence of interest in single-subject methodology in applied work. There are several possible reasons for this renewed interest, but only two are mentioned here. First, treatment-outcome research has increasingly relied on the now popular 'randomized clinical trial' group design methodology to establish empirical support for psychosocial therapies for specific behavioral disorders. Practitioners, who work predominantly with individual clients, are quick to point out that group-outcome data favoring a given intervention, however convincing statistically, include individuals in the group who did not respond to therapy. Moreover, the 'average' group response to treatment may not generalize to the specific response of an individual client to that treatment. Thus, practitioners have been skeptical of how such research informs their work with individual clients. The second, and perhaps more important, reason for the renewed interest in single-subject methodology is driven by pragmatic concerns and the changing nature of the behavioral health care marketplace. Increasingly, third-party payers are requiring practitioners to demonstrate accountability; that is, to show that what they are doing with their clients is achieving the desired effect (i.e., good outcomes) in a cost-effective and lasting manner. Use of single-subject methodology has a place in assisting practitioners in making clinical decisions based on empirical data and in demonstrating accountability to third-party payers and consumers. Further, such methodology, though requiring time and sufficient training to implement properly, is atheoretical and fits nicely with how practitioners from a variety of conceptual schools routinely work with their individual clients (Hayes et al. 1999). Thus, there is great potential in single-subject methodology for bridging the strained scientist–practitioner gap, and ultimately advancing the science of behavior change and an empirically driven approach to practice and treatment innovation.
4. Summary and Conclusions

Methodology is a way of knowing. Single-subject methodology represents a unique way of knowing that has as its subject matter an experimental analysis of the behavior of individual organisms. Described here were the assumptions driving this approach and the main varieties of design options available to address basic experimental and applied questions. Perhaps the greatest asset of single-subject methodology rests with the flexibility with which such designs can be
constructed to meet the joint analytic goals of prediction and control over the behavior of individual organisms. This asset also represents one of the greatest liabilities of such designs, in that flexibility requires knowledge of when and how to modify and/or add design elements to achieve analytic goals, and particularly skill in recognizing significant effects and sources of variability, and in judging when one has demonstrated sufficient levels of prediction and control over the behavior of interest.

See also: Case Study: Logic; Case Study: Methods and Analysis; Psychotherapy: Case Study; Single-case Experimental Designs in Clinical Settings
Bibliography

Baer D M 1977 Perhaps it would be better not to know everything. Journal of Applied Behavior Analysis 10: 167–72
Barlow D H, Hayes S C 1979 Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis 12: 199–210
Barlow D H, Hersen M 1984 Single-Case Experimental Designs: Strategies for Studying Behavior Change, 2nd edn. Allyn & Bacon, Boston, MA
Busk P L, Marascuilo L A 1992 Statistical analysis in single-case research: Issues, procedures, and recommendations, with applications to multiple behaviors. In: Kratochwill T R, Levin J R (eds.) Single-Case Research Design and Analysis. Plenum, New York, pp. 159–85
Forsyth J P, Kollins S, Palav A, Duff K, Maher S 1999 Has behavior therapy drifted from its experimental roots? A survey of publication trends in mainstream behavioral journals. Journal of Behavior Therapy and Experimental Psychiatry 30: 205–20
Furedy J J 1999 Commentary: On the limited role of the single-subject design in psychology: Hypothesis generating but not testing. Journal of Behavior Therapy and Experimental Psychiatry 30: 21–2
Gottman J M 1973 N-of-one and N-of-two research in psychotherapy. Psychological Bulletin 80: 93–105
Hall R V, Cristler C, Cranston S S, Tucker B 1970 Teachers and parents as researchers using multiple baseline designs. Journal of Applied Behavior Analysis 3: 247–55
Hayes S C, Barlow D H, Nelson-Gray R O 1999 The Scientist-Practitioner: Research and Accountability in the Age of Managed Care. Allyn & Bacon, Boston, MA
Hilliard R B 1993 Single-case methodology in psychotherapy process and outcome research. Journal of Consulting and Clinical Psychology 61: 373–80
Huitema B E 1986 Statistical analysis and single-subject designs: Some misunderstandings. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis. Plenum, New York, pp. 209–32
Iversen I H 1991 Methods of analyzing behavior patterns. In: Iversen I H, Lattal K A (eds.) Experimental Analysis of Behavior, pt 2. Elsevier, New York, pp. 193–241
Kazdin A E 1982 Single-Case Research Designs: Methods for Clinical and Applied Settings. Oxford University Press, New York
Parsonson B S, Baer D M 1992 The visual analysis of data, and current research into the stimuli controlling it. In: Kratochwill T R, Levin J R (eds.) Single-Case Research Design and Analysis. Plenum, New York, pp. 15–41
Perone M 1991 Experimental design in the analysis of free-operant behavior. In: Iversen I H, Lattal K A (eds.) Experimental Analysis of Behavior, pt 1. Elsevier, New York, pp. 135–71
Sidman M 1960 Tactics of Scientific Research: Evaluating Experimental Data in Psychology. Basic Books, New York
Skinner B F 1953 Science and Human Behavior. Macmillan, New York
Skinner B F 1957 The experimental analysis of behavior. American Scientist 45: 343–71
Skinner B F 1966 Operant behavior. In: Honig W K (ed.) Operant Behavior: Areas of Research and Application. Appleton-Century-Crofts, New York, pp. 12–32

J. P. Forsyth and C. G. Finlay
Situated Cognition: Contemporary Developments

Situated analyses of cognition that draw on the substantive aspects of Vygotsky's and Leont'ev's work (see Situated Cognition: Origins) did not become prominent in Western Europe and North America until the 1980s. Contemporary situated perspectives on cognition can be divided into two broad groups. One of these, cultural historical activity theory (CHAT), has developed largely independently of mainstream Western psychology by drawing inspiration directly from the writings of Vygotsky and Leont'ev. The other group, which I will call distributed cognition, has developed in reaction to mainstream cognitive science and incorporates aspects of the Soviet work.
1. Cultural Historical Activity Theory

It is important to clarify that the term 'activity' in the acronym CHAT refers to cultural activity or, in other words, to cultural practice. The reference to history indicates a focus on both the evolution of the cultural practices in which people participate and the development of their thinking as they participate in them. Olson's (1995) analysis of the historical development of writing systems is paradigmatic of investigations that focus on the evolution of cultural practices at what might be termed the macrolevel of history writ large. Olson's goal was not merely to document the first appearance of and subsequent changes in writing systems. Instead, he sought to demonstrate that changes in writing systems precipitated changes in thought that in turn made possible further developments in both writing and thinking. In doing so, he elaborated Vygotsky's claim that it was not until reasonably sophisticated writing systems had
emerged that people became consciously aware of language as a means of monitoring and regulating thought. As Olson has made clear, the findings of his analysis have significant implications for children's induction into literacy practices in school. Saxe's (1991) investigation of the body-parts counting system of the Oksapmin people of Papua New Guinea provides a useful point of contrast in that it was concerned with cultural change at a more local level. As Saxe reports, the Oksapmin counting system has no base structure and no distinct terms for numbers. Instead, the Oksapmin count collections of items (e.g., sweet potatoes) by beginning with the left index finger and naming various body parts as they move up the left arm, across the head and shoulders, and down the right arm, ending with the right index finger. At the time that Saxe conducted his fieldwork, a new technology was being introduced, a base-10 currency system. Saxe documents that the Oksapmin with the greatest experience in using the currency, the owners of indigenous trade stores, had developed relatively sophisticated reasoning strategies that were based on the body-parts system but that privileged 10 (e.g., transforming the task of adding nine and seven into that of adding ten and six by 'moving' a body part). Saxe's finding is significant in that it illustrates a general claim made by adherents of CHAT, namely that changes in the artifacts people use, and thus in the cultural practices in which they participate, serve to reorganize their reasoning. The two CHAT studies discussed thus far, those of Olson and Saxe, investigated the evolution of cultural practices. A second body of CHAT research is exemplified by investigations that have compared mathematical reasoning in school with that in various out-of-school settings. Following both Vygotsky and Leont'ev, these investigations can be interpreted as documenting the fusion of the forms of reasoning that people develop with the cultural practices in which they participate. A third line of CHAT research has focused on the changes that occur in people's reasoning as they move from relatively peripheral participation to increasingly substantial participation in the practices of particular communities. In their overview of this type of research, Lave and Wenger (1991) clarify that the cultural tools used by community members are viewed as carrying a substantial portion of a practice's intellectual heritage. As Lave and Wenger note, this implies that novices' opportunities for learning depend crucially on their access to these tools as they are used by the community's old-timers. Lave and Wenger also make it clear that, in equating learning with increasingly substantial participation, they propose to dispense with traditional cognitive analyses entirely. In their view, someone's failure to learn should be explained in terms of the organization of the community and the person's opportunities for access to increasingly substantial forms of
participation, rather than in terms of cognitive deficits attributed to the person. This is clearly a relatively strong claim in that it equates an analysis of the conditions for the possibility of learning with an analysis of learning. It should be apparent from this overview that CHAT research spans a wide range of problems and issues. One theme that cuts across the three lines of work is a focus on the reasoning of groups of people, whether they be people using different writing systems in different historical epochs, Oksapmin trade store owners compared with Oksapmin who have less experience with the new currency, the mathematical reasoning of people in school and nonschool settings, or novices versus old-timers in a community of practice. Differences in the reasoning of people as they participate in the same practices are therefore accounted for in terms of (a) experience in participating in the practice, (b) access to more substantial forms of participation, or (c) differences in the history of their participation in other practices. Each of these types of explanation instantiates Leont'ev's dictum that the individual-in-cultural-practice, rather than the individual per se, is the appropriate unit of analysis.
2. Distributed Cognition

Whereas CHAT research often involves comparisons of groups of people's reasoning, work conducted within the distributed cognition tradition typically focuses on the reasoning of individuals or small groups of people as they solve specific problems or complete specific tasks. Empirical studies conducted within this tradition therefore tend to involve detailed microanalysis of either an individual's or a small group's activity. Further, whereas CHAT researchers often frame people's reasoning as acts of participation in relatively broad systems of cultural practices, distributed cognition theorists typically restrict their focus to the immediate physical, social, and symbolic environment. This circumscription of analyses to people's reasoning in their immediate environments is one indication that this tradition has evolved from mainstream cognitive science. Several of the leading scholars in this tradition, such as John Seely Brown, Alan Collins, and James Greeno, initially achieved prominence within the cognitive science community before substantially modifying their theoretical commitments, in the process contributing to the emergence of a distributed perspective on cognition. The term 'distributed intelligence' or 'distributed cognition' is most closely associated with Roy Pea (1993). Pea coined this term to emphasize that, in his view, cognition is distributed across minds, persons, and symbolic and physical environments. As he and other distributed cognition theorists make clear in their writings, this perspective directly challenges a
foundational assumption of mainstream cognitive science. This is the assumption that cognition is bounded by the skin or by the skull and can be adequately accounted for solely in terms of people's construction of internal mental models of an external world. Distributed cognition theorists instead see cognition as extending out into the immediate environment such that the environment becomes a resource for reasoning.

In coming to this conclusion, distributed cognition theorists have been influenced by a number of studies conducted by CHAT researchers, one of the most frequently cited investigations being that of Scribner (1984). In this investigation, Scribner analyzed the reasoning of workers in a dairy as they filled orders by packing products into crates of different sizes. Her analysis revealed that the loaders did not perform purely mental calculations but instead used the structure of the crates as a resource in their reasoning. For example, if an order called for 10 units of a particular product and six units were already in a crate that held 12 units, experienced loaders rarely subtracted six from 10 to find how many additional units they needed. Instead, they might realize that an order of 10 units would leave two slots in the crate empty and just know immediately from looking at the partially filled crate that four additional units are needed. As part of her analysis, Scribner convincingly demonstrated that the loaders developed strategies of this type in situ as they went about their daily business of filling orders. For distributed cognition theorists, this indicated that the system that did the thinking was the loader in interaction with a crate. From the distributed perspective, the loaders' ways of knowing are therefore treated as emergent relations between them and the immediate environment in which they worked.
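Scribner's crate example can be restated as a contrast between two procedures. In the Python sketch below (the function names and the crate representation are invented here for exposition, not drawn from Scribner), the first procedure performs the purely mental subtraction while the second reasons from the visible structure of the crate.

```python
# Hypothetical rendering of the two strategies Scribner contrasts; the
# function names and the crate representation are invented for exposition.
CRATE_CAPACITY = 12

def units_needed_mentally(order, units_in_crate):
    """Purely mental calculation: subtract what is in the crate from the order."""
    return order - units_in_crate

def units_needed_from_crate(order, empty_slots_now, capacity=CRATE_CAPACITY):
    """Crate-based strategy: reason from the visible structure of the crate.

    An order of `order` units will leave `capacity - order` slots empty, so
    the loader simply fills slots until that many remain empty.
    """
    empty_slots_when_done = capacity - order  # e.g., 12 - 10 = 2 slots stay empty
    return empty_slots_now - empty_slots_when_done

# Scribner's example: an order of 10 units with 6 already in a 12-slot crate.
print(units_needed_mentally(10, 6))         # 4
print(units_needed_from_crate(10, 12 - 6))  # 4, read off the crate itself
```

The contrast is the point: in the second procedure the operative quantities are visible properties of the crate, so part of the computation is carried by the environment rather than performed entirely in the head.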
Part of the reason that distributed cognition theorists attribute such significance to Scribner's study and to other investigations conducted by CHAT researchers is that they capture what Hutchins (1995) refers to as 'cognition in the wild.' This focus on people's reasoning as they engage in both everyday and workplace activities contrasts sharply with the traditional school-like tasks that are often used in mainstream cognitive science investigations. In addition to questioning whether people's reasoning on school-like tasks constitutes a viable set of cases from which to develop general models of cognition, several distributed cognition theorists have also critiqued current school instruction. In doing so, they have broadened their focus beyond cognitive science's traditional emphasis on the structure of particular tasks by drawing attention to the nature of the classroom activities within which the tasks take on meaning and significance for students. Brown et al. (1989) developed one such critique by observing that school instruction typically aims to teach students abstract concepts and general skills on the assumption that students will be able to apply them directly in a wide range of settings. In challenging this assumption, Brown et al. argue that the appropriate use of a concept or skill requires engagement in activities similar to those in which the concept or skill was developed and is actually used. In their view, the well-documented finding that most students do not develop widely applicable concepts and skills in school is attributable to the radical differences between classroom activities and those of both the disciplines and everyday, out-of-school settings. They contend that successful students learn to meet the teacher's expectations by relying on specific features of classroom activities that are alien to activities in the other settings. In developing this explanation, Brown et al. treat the concepts and skills that students actually develop in school as relations between students and the material, social, and symbolic resources of the classroom environment.

It might be concluded from the two examples given thus far, those of the dairy workers and of students relying on what might be termed superficial cues in the classroom, that distributed cognition theorists do not address more sophisticated types of reasoning. Researchers working in this tradition have in fact analyzed a number of highly technical, work-related activities. The most noteworthy of these studies is, perhaps, Hutchins's (1995) analysis of the navigation team of a naval vessel as they brought their ship into San Diego harbor. In line with other investigations of this type, Hutchins argues that the entire navigation team and the artifacts it used constitute the appropriate unit for a cognitive analysis. From the distributed perspective, it is this system of people and artifacts that did the navigating and over which cognition was distributed. In developing his analysis, Hutchins pays particular attention to the role of the artifacts as elements of this cognitive system. He argues, for example, that the cartographer has done much of the reasoning for the navigator who uses a map. This observation is characteristic of distributed analyses and implies that to understand a cognitive process, it is essential to understand how parts of that process have, in effect, been sedimented in tools and artifacts. Distributed cognition theorists therefore contend that the environments of human thinking are thoroughly artificial. In their view, it is by creating environments populated with cognitive resources that humans create the cognitive powers that they exercise in those environments. As a consequence, the claim that artifacts do not merely serve to amplify cognitive processes but instead reorganize them is a core tenet of the distributed cognition perspective.
3. Current Issues and Future Directions

It is apparent that CHAT and distributed cognition share a number of common assumptions. For example, both situate people's reasoning within encompassing
activities and both emphasize the crucial role of tools and artifacts in cognitive development. However, the illustrations presented in the preceding paragraphs also indicate that there are a number of subtle differences between the two traditions. One concerns the purview of the researcher in that distributed cognition theorists tend to focus on social, material, and symbolic resources within the immediate local environment whereas CHAT theorists frequently locate an individual's activity within a more encompassing system of cultural practices. A second difference concerns the way in which researchers working in the two traditions address the historical dimension of cognition. Distributed cognition theorists tend to focus on tools and artifacts per se, which are viewed as carrying the reasoning of their developers from prior generations. In contrast, CHAT theorists treat artifacts as one aspect of a cultural practice, albeit an important one. Consequently, they situate cognition historically by analyzing how systems of cultural practices have evolved. Thus, whereas distributed cognition theorists focus on the history of the cognitive resources available in the immediate environment, CHAT theorists contend that this environment is defined by a historically contingent system of cultural practices. Despite these differences, it is possible to identify several current issues that cut across the two traditions. The two issues I will focus on are those of transfer and of participation in multiple communities of practice.
3.1 Transfer

As noted in the article Situated Cognition: Origins, the notion that knowledge is acquired in one setting and then transferred to other settings is central to the cognition plus view as well as to mainstream cognitive science. To avoid confusion, it is useful to differentiate between this notion of transfer as a theoretical idea and what might be termed the phenomenon called transfer. This term refers to specific instances of behavior that a cognition plus theorist would account for in terms of the transfer of knowledge from one situation to another. As will become clear, CHAT and distributed cognition theorists both readily acknowledge the phenomenon of transfer but propose different ways of accounting for it.

An analysis developed by Bransford and Schwartz (1999) serves to summarize concerns about the traditional idea of transfer that underpins many cognitive science investigations. Writing from within the cognition plus view, Bransford and Schwartz (1999) observe that the theory underlying much cognitive science research characterizes transfer as the ability to apply previous learning directly to a new setting or problem. As they note, transfer investigations are designed to ensure that subjects do not have the opportunity to learn to solve
new problems either by getting feedback or by using texts and colleagues as resources. Although Bransford and Schwartz indicate that they consider this traditional perspective valid, they also argue for a broader conception of transfer that includes a focus on whether people are prepared to learn to solve new problems. In making this proposal, they attempt to bring the active nature of transfer to the fore. As part of their rationale, they give numerous examples to illustrate that people often learn to operate in a new setting by actively changing the setting rather than by mapping previously acquired understandings directly on to it. This leads them to argue that cognitive scientists should change their focus by looking for evidence of useful future learning rather than of direct application.

Bransford and Schwartz's proposal deals primarily with issues of method in that it does not challenge transfer as a theoretical idea. Nonetheless, their emphasis on preparation for future learning is evident in the distributed analysis of the phenomenon of transfer developed by Greeno and MMAP (the Middle-school Mathematics through Applications Project) (1998). Greeno and MMAP contend that the ways of knowing that people develop in particular settings emerge as relations between them and the immediate environment. This necessarily implies that transfer involves active adaptation to new circumstances. As part of their theoretical approach, Greeno and MMAP analyze specific environments in terms of the affordances and constraints that they provide for reasoning. In the case of a traditional mathematics classroom, for example, the affordances might include the organization of lessons and of the textbook. The constraints might include the need to produce correct answers, the limited time available to complete sets of exercises, and the lack of access to peers and other resources. Greeno and MMAP would characterize the process of learning to be a successful student in such a classroom as one of becoming attuned to these affordances and constraints. From this distributed perspective, the phenomenon called transfer is then explained in terms of the similarity of the constraints and affordances of different settings rather than in terms of the transportation of knowledge from one setting to another.

The focus on preparation for future learning advocated by Bransford and Schwartz (1999) is quite explicit in Beach's (1995) investigation of the transition between work and school in a Nepalese village where formal schooling had been introduced during the last 20 years of the twentieth century. Beach worked within the CHAT tradition when he compared the arithmetical reasoning of high school students who were apprentice shopkeepers with the reasoning of shopkeepers who were attending adult education classes. His analysis revealed that the shopkeepers' arithmetical reasoning was more closely related in the two situations than was that of the students. In line with the basic tenets of CHAT, Beach accounted for this finding by framing the shopkeepers' and students' reasoning as acts of participation in relatively global
cultural practices, those of shopkeeping and of studying arithmetic in school. His explanation hinges on the observation that the students making the school-to-work transition initially defined themselves as students but subsequently defined themselves as shopkeepers when they worked in a shop. In contrast, the shopkeepers continued to define themselves as shopkeepers even when they participated in the adult education classes. Their goal was to develop arithmetical competencies that would enable them to increase the profits of their shops. In Beach's view, it is this relatively strong relationship between the shopkeepers' participation in the practices of schooling and shopkeeping that explains the close relationship between their arithmetical reasoning in the two settings. Thus, whereas Greeno and MMAP account for instances of the phenomenon called transfer in terms of similarities in the affordances and constraints of immediate environments, Beach does so in terms of the experienced commensurability of certain forms of participation. In the case at hand, the phenomenon called transfer occurred to a greater extent with the shopkeepers because they experienced participating in the practices of shopkeeping and schooling as more commensurable than did the students.

The contrast between Beach's and Greeno and MMAP's analyses illustrates how general differences between the distributed cognition and CHAT traditions can play out in explanations of the phenomenon called transfer. However, attempts to reconceptualize transfer within these two traditions are still in their early stages and there is every indication that this will continue to be a major focus of CHAT and distributed cognition research in the coming years.
3.2 Participating in Multiple Communities of Practice

To date, the bulk of CHAT research has focused on the forms of reasoning that people develop as they participate in particular cultural practices. For their part, distributed cognition theorists have been primarily concerned with the forms of reasoning that emerge as people use the cognitive resources of the immediate environment. In both cases, the focus has been on participation in well-circumscribed communities of practice or engagement in local systems of activity. An issue that appears to be emerging as a major research question is that of understanding how people deal with the tensions they experience when the practices of the different communities in which they participate are in conflict. The potential significance of this issue is illustrated by an example taken from education. A number of studies reveal that students' home communities can involve differing norms of participation, language, and communication, some of which might be in conflict with those that the teacher seeks to
establish in the classroom. Pragmatically, these studies lead directly to concerns for equity and indicate the importance of viewing the diversity of the practices of students' home communities as an instructional resource rather than an obstacle to be overcome. Theoretically, these studies indicate the importance of coming to understand how students attempt to resolve ongoing tensions between home and school practices. The challenge is therefore to develop analytical approaches that treat students' activity in the classroom as situated not merely with respect to the immediate learning environment, but with respect to their history of participation in the practices of particular out-of-school communities. The classroom would then be viewed as the immediate arena in which the students' participation in multiple communities of practice plays out in face-to-face interaction. This same general orientation is also relevant when attempting to understand people's activity in a number of other specific settings.

As this is very much an emerging issue, CHAT and distributed cognition research on participation in multiple communities of practice is still in its infancy. The most comprehensive theoretical exposition to date is perhaps that of Wenger (1998). There is every reason to predict that this issue will become an increasingly prominent focus of future research in both traditions.
4. Concluding Comments

It should be apparent from this overview of contemporary developments in situated cognition that whereas CHAT researchers draw directly on Vygotsky's and Leont'ev's theoretical insights, the relationship to the Soviet scholarship is less direct in the case of distributed cognition research. Inspired in large measure by the findings of CHAT researchers, this latter tradition has emerged as a reaction to perceived limitations of mainstream cognitive science. As a consequence, distributed cognition theorists tend to maintain a dialog with their mainstream colleagues. In contrast, the problems that CHAT researchers view as significant are typically far less influenced by mainstream considerations.

The issue addressed in the previous section of this overview was that of the possible future directions for the two research traditions. It is important to note that the focus on just two issues—transfer and participation in multiple communities—was necessarily selective. It would have been possible to highlight a number of other issues that include the learning of the core ideas of academic disciplines and the design of tools as cognitive resources. In both cases, it can legitimately be argued that the CHAT and distributed cognition traditions each have the potential to make significant contributions. More generally, it can legitimately be argued that both research traditions are in progressive phases at the present time.
See also: Cultural Psychology; Learning by Occasion Setting; Situated Cognition: Origins; Situated Knowledge: Feminist and Science and Technology Studies Perspectives; Vygotskij, Lev Semenovic (1896–1934); Vygotskij's Theory of Human Development and New Approaches to Education
Bibliography

Beach K 1995 Activity as a mediator of sociocultural change and individual development: The case of school-work transition in Nepal. Mind, Culture, and Activity 2: 285–302
Bransford J D, Schwartz D L 1999 Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education 24: 61–100
Brown J S, Collins A, Duguid P 1989 Situated cognition and the culture of learning. Educational Researcher 18(1): 32–42
Greeno J G 1998 The situativity of knowing, learning, and research. American Psychologist 53: 5–26
Hutchins E 1995 Cognition in the Wild. MIT Press, Cambridge, MA
Lave J, Wenger E 1991 Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge, UK
Olson D R 1995 Writing and the mind. In: Wertsch J V, del Rio P, Alvarez A (eds.) Sociocultural Studies of Mind. Cambridge University Press, New York, pp. 95–123
Pea R D 1993 Practices of distributed intelligence and designs for education. In: Salomon G (ed.) Distributed Cognitions: Psychological and Educational Considerations. Cambridge University Press, New York, pp. 47–87
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Scribner S 1984 Studying working intelligence. In: Rogoff B, Lave J (eds.) Everyday Cognition: Its Development in Social Context. Harvard University Press, Cambridge, MA, pp. 9–40
Wenger E 1998 Communities of Practice. Cambridge University Press, New York and Cambridge, UK
P. Cobb
Situated Cognition: Origins

Situated cognition encompasses a range of theoretical positions that are united by the assumption that cognition is inherently tied to the social and cultural contexts in which it occurs. This initial definition serves to differentiate situated perspectives on cognition from what Lave (1991) terms the cognition plus view. This latter view follows mainstream cognitive science in characterizing cognition as the internal processing of information. However, proponents of this view also acknowledge that individual cognition is influenced both by the tools and artifacts that people use to accomplish goals and by their ongoing social interactions with others. To accommodate these insights, cognition plus theorists analyze the social world as a network of
factors that influence individual cognition. Thus, although the cognition plus view expands the conditions that must be taken into account when developing adequate explanations of cognition and learning, it does not reconceptualize the basic nature of cognition. In contrast, situated cognition theorists challenge the assumption that social processes can be clearly partitioned off from cognitive processes and treated as external conditions for them. These theorists instead view cognition as extending out into the world and as being social through and through. They therefore attempt to break down a distinction that is basic both to mainstream cognitive science and to the cognition plus view, that between the individual reasoner and the world reasoned about.
1. Situation and Context

I can further clarify this key difference between the situated and cognition plus viewpoints by focusing on the underlying metaphors that serve to orient adherents to each position as they frame questions and plan investigations. These metaphors are apparent in the different ways that adherents to the two positions use the key terms situation and context (Cobb and Bowers 1999, Sfard 1998). The meaning of these terms in situated theories of cognition can be traced to the notion of position as physical location. In everyday conversation we frequently elaborate this notion metaphorically when we describe ourselves and others as being positioned with respect to circumstances in the world of social affairs. This metaphor is apparent in expressions such as, 'My situation at work is pretty good at the moment.' In this and similar examples, the world of social affairs (e.g., work) in which individuals are considered to be situated is the metaphorical correlate of the physical space in which material objects are situated in relation to each other. Situated cognition theorists such as Lave (1988), Saxe (1991), Rogoff (1995), and Cole (1996) elaborate this notion theoretically by introducing the concept of participation in cultural practices (see Fig. 1).
[Figure 1. Metaphorical underpinnings of the cognition plus and situated cognition perspectives]
Crucially, this core construct of participation is not restricted to face-to-face interactions with others. Instead, all individual actions are viewed as elements or aspects of an encompassing system of cultural practices, and individuals are viewed as participating in cultural practices even when they are in physical isolation from others. Consequently, when situated cognition theorists speak of context, they are referring to a sociocultural context that is defined in terms of participation in a cultural practice. This view of context is apparent in a number of investigations in which situated cognition theorists have compared mathematical reasoning in school with mathematical reasoning in various out-of-school settings such as grocery shopping (Lave 1988), packing crates in a dairy (Scribner 1984), selling candies on the street (Nunes et al. 1993, Saxe 1991), laying carpet (Masingila 1994), and growing sugar cane (DeAbreu 1995). These studies document significant differences in the forms of mathematical reasoning that arise in the context of different practices that involve the use of different tools and sign systems, and that are organized by different overall motives (e.g., learning mathematics as an end in itself in school vs. doing arithmetical calculations while selling candies on the street in order to survive economically). The findings of these studies have been influential, and challenge the view that mathematics is a universal form of reasoning that is free of the influence of culture.

The underlying metaphor of the cognition plus view also involves the notion of position. However, whereas the core metaphor of situated theories is that of position in the world of social circumstances, the central metaphor of the cognition plus view is that of the transportation of an item from one physical location to another (see Fig. 1). This metaphor supports the characterization of knowledge as an internal cognitive structure that is constructed in one setting and subsequently transferred to other settings in which it is applied. In contrast to this treatment of knowledge as an entity that is acquired in a particular setting, situated cognition theorists are more concerned with knowing as an activity or, in other words, with types of reasoning that emerge as people participate in particular cultural practices. In line with the transfer metaphor, context, as it is defined by cognition plus theorists, consists of the task an individual is attempting to complete together with others' actions and available tools and artifacts. Crucially, whereas situated cognition theorists view both tools and others' actions as integral aspects of an encompassing practice that only have meaning in relation to each other, cognition plus theorists view them as aspects of what might be termed the stimulus environment that is external to internal cognitive processing. Consequently, from this latter perspective, context is under a researcher's control just as is a subject's physical location, and can therefore be systematically varied in experiments. Given this conception of context, cognition plus theorists can reasonably argue that cognition is not always tied to
context because people do frequently use what they have learned in one setting as they reason in other settings. However, this claim is open to dispute if it is interpreted in terms of situated cognition theory where all activity is viewed as occurring in the context of a cultural practice. Further, situated cognition theorists would dispute the claim that cognition is partly context independent because, from their point of view, an act of reasoning is necessarily an act of participation in a cultural practice. As an illustration, situated cognition theorists would argue that even a seemingly decontextualized form of reasoning such as research mathematics is situated in that mathematicians use common sign systems as they engage in the communal enterprise of producing results that are judged to be significant by adhering to agreed-upon standards of proof. For situated cognition theorists, mathematicians’ reasoning cannot be adequately understood unless it is located within the context of their research communities. Further, these theorists would argue that to be fully adequate, an explanation of mathematicians’ reasoning has to account for its development or genesis. To address this requirement, it would be necessary to analyze the process of the mathematicians’ induction into their research communities both during their graduate education and during the initial phases of their academic careers. This illustrative example is paradigmatic in that situated cognition theorists view learning as synonymous with changes in the ways that individuals participate in the practices of communities. In the case of research mathematics, these theorists would argue that mathematicians develop particular ways of reasoning as they become increasingly substantial participants in the practices of particular research communities. They would therefore characterize the mathematicians’ apparently decontextualized reasoning as embedded in these practices and thus as being situated.
2. Vygotsky's Contribution

Just as situated cognition theorists argue that an adequate analysis of forms of reasoning requires that we understand their genesis, so an adequate account of contemporary situated perspectives requires that we trace their development. All current situated approaches owe a significant intellectual debt to the Russian psychologist Lev Vygotsky, who developed his cultural-historical theory of cognitive development in the period of intellectual ferment and social change that followed the Russian revolution. Vygotsky was profoundly influenced by Karl Marx's argument that it is the making and use of tools that serves to differentiate humans from other animal species. For Vygotsky, human history is the history of artifacts such as language, counting systems, and writing, that
are not invented anew by each generation but are instead passed on and constitute the intellectual bequest of one generation to the next. His enduring contribution to psychology was to develop an analogy between the use of physical tools and the use of intellectual tools such as sign systems (Kozulin 1990, van der Veer and Valsiner 1991). He argued that just as the use of a physical tool serves to reorganize activity by making new goals possible, so the use of sign systems serves to reorganize thought. From this perspective, culture can therefore be viewed as a repository of sign systems and other artifacts that are appropriated by children in the course of their intellectual development (Vygotsky 1978). It is important to understand that for Vygotsky, children's mastery of a counting system does not merely enhance or amplify an already existing cognitive capability. Instead, children's ability to reason numerically is created as they appropriate the counting systems of their culture. This example illustrates Vygotsky's more general claim that children's minds are formed as they appropriate sign systems and other artifacts.

Vygotsky refined his thesis that children's cognitive development is situated with respect to the sign systems of their culture as he pursued several related lines of empirical inquiry. In his best known series of investigations, he attempted to demonstrate the crucial role of face-to-face interactions in which an adult or more knowledgeable peer supports the child's use of an intellectual tool such as a counting system (Vygotsky 1962). For example, one of the child's parents might engage the child in a play activity in the course of which parent and child count together. Vygotsky interpreted observations of such interactions as evidence that the use of sign systems initially appears in children's cognitive development on what he termed the 'intermental plane of social interaction.' He further observed that over time the adult gradually reduces the level of support until the child can eventually carry out what was previously a joint activity on his or her own. This observation supported his claim that the child's mind is created via a process of internalization from the intermental plane of social interaction to the intramental plane of individual thought.

A number of Western psychologists revived this line of work in the 1970s and 1980s by investigating how adults support or scaffold children's learning as they interact with them. However, in focusing almost exclusively on the moves or strategies of the adult in supporting the child, they tended to portray learning as a relatively passive process. This conflicts with the active role that Vygotsky attributed to the child in constructing what he referred to as the 'higher mental functions' such as reflective thought. In addition, these turn-by-turn analyses of adult–child interactions typically overlooked the broader theoretical orientation that informed Vygotsky's investigations. In focusing on social interaction, he attempted to demonstrate
both that the environment in which the child learns is socially organized and that it is the primary determinant of the forms of thinking that the child develops. He therefore rejected descriptions of the child’s learning environment that are cast in terms of absolute indices because such a characterization defines the environment in isolation from the child. He argued that the analyses should instead focus on what the environment means to the child. This, for him, involved analyzing the social situations in which the child participates and the cognitive processes that the child develops in the course of that participation. Thus far, in discussing Vygotsky’s work, I have emphasized the central role that he attributed to social interactions with more knowledgeable others. There is some indication that shortly before his death in 1934 at the age of 36, he was beginning to view the relation between social interaction and cognitive development as a special case of a more general relation between cultural practices and cognitive development (Davydov and Radzikhovskii 1985, Minick 1987). In doing so he came to see face-to-face interactions as located within an encompassing system of cultural practices. For example, he argued that it is not until children participate in the activities of formal schooling that they develop what he termed scientific concepts. In making this claim, he viewed classroom interactions between a teacher and his or her students as an aspect of the organized system of cultural practices that constituted formal schooling in the USSR at that time. A group of Soviet psychologists, the most prominent of whom was Alexei Leont’ev, developed this aspect of cultural-historical theory more fully after Vygotsky’s death.
3. Leont'ev's Contribution

Two aspects of Leont'ev's work are particularly significant from the vantage point of contemporary situated perspectives. The first concerns his clarification of an appropriate unit of analysis when accounting for intellectual development. Although face-to-face interactions constitute the immediate social situation of the child's development, Leont'ev saw the encompassing cultural practices in which the child participates as constituting the broader context of his or her development (Leont'ev 1978, 1981). For example, Leont'ev might have viewed an interaction in which a parent engages a child in activities that involve counting as an instance of the child's initial, supported participation in cultural practices that involve dealing with quantities. Further, he argued that children's progressive participation in specific cultural practices underlies the development of their thinking. Intellectual development was, for him, synonymous with the process by which the child becomes a full participant in particular cultural practices. In other words, he viewed the development of children's minds and
their increasingly substantial participation in various cultural practices as two aspects of a single process. This is a strongly situated position in that the cognitive capabilities that children develop are not seen as distinct from the cultural practices that constitute the context of their development. From this perspective, the cognitive characteristics a child develops are characteristics of the child-in-cultural-practice in that they cannot be defined apart from the practices that constitute the child's world. For Leont'ev, this implied that the appropriate unit of analysis is the child-in-cultural-practice rather than the child per se. As stated earlier, it is this rejection of the individual as a unit of analysis that separates contemporary situated perspectives from what I called the cognition plus viewpoint.

Leont'ev's second important contribution concerns his analysis of the external world of material objects and events. Although Vygotsky brought sign systems and other cultural tools to the fore, he largely ignored material reality. In building on the legacy of his mentor, Leont'ev argued that material objects as they come to be experienced by developing children are defined by the cultural practices in which they participate. For example, a pen becomes a writing instrument rather than a brute material object for the child as he or she participates in literacy practices. In Leont'ev's view, children do not come into contact with material reality directly, but are instead oriented to this reality as they participate in cultural practices. He therefore concluded that the meanings that material objects come to have are a product of their inclusion in specific practices. This in turn implied that these meanings cannot be defined independently of the practices. This thesis only served to underscore his argument that the individual-in-cultural-practice constitutes the appropriate analytical unit for psychology.

See also: Cultural Psychology; Learning by Occasion Setting; Situated Cognition: Contemporary Developments; Situated Knowledge: Feminist and Science and Technology Studies Perspectives; Situated Learning: Out of School and in the Classroom; Transfer of Learning, Cognitive Psychology of; Vygotskij, Lev Semenovic (1896–1934); Vygotskij's Theory of Human Development and New Approaches to Education
Bibliography

Cobb P, Bowers J 1999 Cognitive and situated perspectives in theory and practice. Educational Researcher 28(2): 4–15
Cole M 1996 Cultural Psychology. Harvard University Press, Cambridge, MA
Davydov V V, Radzikhovskii L A 1985 Vygotsky's theory and the activity-oriented approach in psychology. In: Wertsch J V (ed.) Culture, Communication, and Cognition: Vygotskian Perspectives. Cambridge University Press, New York, pp. 35–65
DeAbreu G 1995 Understanding how children experience the relationship between home and school mathematics. Mind, Culture, and Activity 2: 119–42
Kozulin A 1990 Vygotsky's Psychology: A Biography of Ideas. Harvard University Press, Cambridge, MA
Lave J 1988 Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life. Cambridge University Press, New York
Lave J 1991 Situating learning in communities of practice. In: Resnick L B, Levine J M, Teasley S D (eds.) Perspectives on Socially Shared Cognition. American Psychological Association, Washington, DC, pp. 63–82
Leont'ev A N 1978 Activity, Consciousness, and Personality. Prentice-Hall, Englewood Cliffs, NJ
Leont'ev A N 1981 The problem of activity in psychology. In: Wertsch J V (ed.) The Concept of Activity in Soviet Psychology. Sharpe, Armonk, NY, pp. 37–71
Masingila J O 1994 Mathematics practice in carpet laying. Anthropology and Education Quarterly 25: 430–62
Minick N 1987 The development of Vygotsky's thought: An introduction. In: Rieber R W, Carton A S (eds.) The Collected Works of Vygotsky, L.S. (Vol. 1): Problems of General Psychology. Plenum, New York, pp. 17–38
Nunes T, Schliemann A D, Carraher D W 1993 Street Mathematics and School Mathematics. Cambridge University Press, Cambridge, UK
Rogoff B 1995 Observing sociocultural activity on three planes: Participatory appropriation, guided participation, and apprenticeship. In: Wertsch J V, del Rio P, Alvarez A (eds.) Sociocultural Studies of Mind. Cambridge University Press, New York, pp. 139–64
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Scribner S 1984 Studying working intelligence. In: Rogoff B, Lave J (eds.) Everyday Cognition: Its Development in Social Context. Harvard University Press, Cambridge, MA, pp. 9–40
Sfard A 1998 On two metaphors for learning and the dangers of choosing just one. Educational Researcher 27(2): 4–13
van der Veer R, Valsiner J 1991 Understanding Vygotsky: A Quest for Synthesis. Blackwell, Cambridge, MA
Vygotsky L S 1962 Thought and Language. MIT Press, Cambridge, MA
Vygotsky L S 1978 Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, Cambridge, MA
P. Cobb
Situated Knowledge: Feminist and Science and Technology Studies Perspectives

The expression 'situated knowledge,' especially in its plural form 'situated knowledges,' is associated with feminist epistemology, feminist philosophy of science, and science and technology studies. The term was introduced by historian of the life sciences and feminist science and technology studies scholar Donna Haraway in her landmark essay Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective (Haraway 1991, pp. 183–201). The
essay was a response to feminist philosopher of science Sandra Harding's discussion of the 'science question in feminism' (Harding 1986). In her analysis of the potential of modern science to contribute to the goals of feminism, Harding noted three different accounts of objective knowledge in feminist epistemology: feminist empiricism, feminist standpoint, and feminist postmodernism (Harding 1986, pp. 24–6). Feminist empiricism attempted to replace more biased with less biased science. The feminist standpoint, echoing the Marxist tradition from which it derived, stressed the relevance of the social positioning of the knower to the content of what is known. Feminist postmodernism accentuated the power dynamics underlying the use of the language of objectivity in science. Haraway, taking off from Harding, diagnosed a 'Scylla and Charybdis' of temptations between which feminists attempt to navigate on the question of objectivity: radical constructionism and feminist critical empiricism. As she put it, what feminists wanted from a theory of objectivity was 'enforceable, reliable accounts of things not reducible to power moves and agonistic, high status games of rhetoric or to scientistic, positivist arrogance' (Haraway 1991, p. 188). Her notion of situated knowledges was her attempt to provide just such a theory of objectivity. Despite the single author provenance, the term resonated with attempts by other scholars in the history, sociology, and philosophy of science to address similar epistemological tensions. The concept was quickly and fruitfully taken up in science and technology studies and feminist theory, provoking a certain amount of reworking in its turn.
1. 'Situated Knowledge' and Objectivity: Constructed Yet Real

The term 'situated knowledge' derives its theoretical importance from its seemingly oxymoronic character, particularly when applied to knowledge about the natural world. It is common to think of modern scientific knowledge as universal, so that it has the same content no matter who possesses it. It is also almost definitional to hold that objective knowledge is warranted by the fact that it captures reality as it really is, rather than being warranted by the situational circumstances out of which the knowledge was generated or discovered (see Shapin 1994, pp. 1–8 for discussion of, and citations relevant to, various manifestations of these points). Thus, if the law of gravity enables us to make reliable experimental predictions, it is because there is such a thing as gravity that is adequately captured by our scientific understanding; in short, the truth of the knowledge is its own warrant. It is only in the case of false or superseded knowledge that we typically explain what went wrong by reference to faulty assumptions, sloppy work, ill-calibrated equipment, the Zeitgeist, or other aspects of the
context of discovery. The idea of 'situated knowledge' contests these supposed concomitants of objective knowledge. It suggests that objective knowledge, even our best scientific knowledge of the natural world, depends on the partiality of its material, technical, social, semiotic, and embodied means of being promulgated. Haraway's notion thus has affinities with other feminist epistemologies which have noted that facts can differ in their content from one time, place, and knower to another (e.g., Collins 1989). It also has sympathies in common with sociologists of science and scholars of science and technology studies who have suggested that capturing 'reality as it really is' may be dependent on institutional, technical, and cultural norms (Kuhn 1962\1970), on practice (Clarke and Fujimura 1992, Pickering 1992), and on attempts to witness, measure, comprehend, or command assent to it (Latour 1987, Shapin 1994, Shapin and Schaffer 1985). All these scholars share a search for the theoretical resources to do justice to the embeddedness of science and truth. These challenges to conventional views of objectivity bring situated knowledges into conversation with key debates in the philosophy of science around the theory-ladenness of facts (Hesse 1980, pp. 63–110). Additionally, the suspicion of transcendent universalism entrains an epistemological and political distrust of clear-cut distinctions between subject and object, and a blurring of the distinction between context and content of knowledge or discovery.

Situated knowledges are as hostile to relativism as they are to realism. Haraway describes relativism as 'being nowhere while claiming to be everywhere equally' (Haraway 1991, p. 191) and realism as 'seeing everything from nowhere' (Haraway 1991, p. 189), and conceives of them both as 'god-tricks' promising total, rather than partial, located, and embodied vision. In contrast to realist or relativist epistemologies, Haraway sees the possibility of sustained and rational objective inquiry in the epistemology of partial perspectives. This requires, she maintains, reclaiming vision as a series of technological and organic embodiments, as and when and where and how vision is actually enabled.

This crafting of a feminist epistemology of situated knowledges on the basis of vision and partial perspective is noteworthy. The links in the history of science to militarism, capitalism, colonialism, and male supremacy have been theorized around the masculinist gaze of the powerful but disembodied knower disciplining and subjugating the weak by means of a multitude of technologies of surveillance. Feminists have lamented the privilege granted to the visual as a sure basis of knowledge and bemoaned the sidelining in modernity of what some cast as more feminine and less intrinsically violent ways of knowing involving emotion, voice, touch, and listening (Gilligan 1982). Haraway is concerned that feminists not cede power to those whose practices they wish
critically to engage. It is in this spirit that she grounds her feminist solution in an embrace of science and vision, 'the real game in town, the one we must play' (Haraway 1991, p. 184).
2. The Feminist Roots of the Dilemma

A tension between emancipatory empiricism and its associated egalitarian or socialist politics, and feminist postmodern constructionism and its associated identity politics, resonates throughout contemporary Western feminist theory. It is a recent hallmark of those engaged in feminist philosophical and social studies of science that they seek to resolve one or another version of this tension. One horn of the feminist dilemma, according to Haraway, represents the good feminist reasons to be attracted to radical constructionism. Feminist postmodernists, and analysts of science and technology influenced by semiotics (including Haraway herself), helped develop and often appeal to 'a very strong social constructionist argument for all forms of knowledge claims' including scientific ones (Akrich and Latour 1992, Haraway 1991, p. 184). This position has the benefit of showing the links between power—such things as status, equipment, rhetorical privilege, funding, and so on—and the production of knowledge and credibility. The downside, from the point of view of feminists interested in arguing for a better world, is that the radical constructionist argument risks rendering all knowledges as fundamentally ideological, with no basis for choosing between more and less just ideas, more and less true versions of reality. As Haraway provocatively expressed it, embracing this temptation seemed to leave no room for 'those of us who would still like to talk about reality with more confidence than we allow the Christian right's discussion of the Second Coming and their being raptured out of the final destruction of the world' (Haraway 1991, p. 185).

The second horn of the dilemma, according to Haraway, involves 'holding out for a feminist version of objectivity' through materialism or empiricism (Haraway 1991, p. 186). Haraway briefly discusses both Marxist derived feminisms and feminist empiricism. Feminisms with Marxist inspirations are several, and their genealogy can be traced in a number of ways. Feminists have long criticized Marxist humanism for its premise that the self-realization of man is dependent on the domination of nature, and for its account of the historical progression of modes of production that grants no historical agency to domestic and unpaid labor (Hartmann 1981). Some have responded by developing feminist versions of historical materialism (Hartsock 1983). Feminist standpoint theorists appropriated the general insight, inherited from both Marxist thought and the sociology of knowledge, that one's social-structural position in
society—such things as one's class or relation to the means of production, or one's gender or ethnonational characteristics—determines or affects how and what one knows (Smith 1990). Likewise, the idea that some social structural positions confer epistemological privilege has been widely adopted by standpoint theorists and feminist epistemologists arguing for specifically feminine ways of knowing (Rose 1983). 'Seeing from below,' that is, from a position of subordination, has commonly been theorized by feminists as the position of epistemological privilege, on the grounds that those with little to gain from internalizing powerful ideologies would be able to see more clearly than those with an interest in reproducing the status quo. In Patricia Hill Collins' version of standpoint theory, for example, these insights are used both to validate the knowledges of the historically disenfranchised, and to reverse the hegemonic ranking of knowledge and authority, and claim epistemological privilege for African–American women (Collins 1989).

Psychoanalytic theory, particularly anglophone object relations theory, inspired some of the early writings on gender and science (Chodorow 1978, Keller 1985). Object relations theory attempted to explain the different relation of women and men to objectivity, abstract thought, and science in modern societies. To account for this difference, the theory posited gender-based differences in the socialization of sons and daughters in Western middle class heterosexual nuclear families. Boys, according to this theory, are socialized to separate from the primary caregiver who is the mother in this normative family scenario. They thus learn early and well by analogy with their emotional development that relational thinking is inappropriate for them; separating themselves from the object of knowledge, as from the object of love, is good. Girls, on the other hand, are supposedly socialized to be like their primary caregiver, so that they can reproduce mothering when their turn comes. Relationality and connectivity, not abstraction and separation, are the analogous ordering devices of girls' affective and epistemological worlds. As applied to objectivity and scientific knowledge, object relations theory seemed to explain to feminists, without resort to distasteful biological determinisms denying women scientific aptitude, why women were excluded from much of science and technology. It also suggested that there were (at least) two distinct ways of knowing, and that much might have been lost in the violence and separation of masculinist science that could be restored by a proper valuation of the feminine values of connection and empathy (Harding 1986, Keller 1983). Like Marxism, psychoanalytic approaches to objectivity gave feminists a means to show the relevance of one's social position to knowledge. Like feminist empiricism, they encouraged the belief in the possibility of an improved, feminist, objectivity (Harding 1992).
The feminist canon contains a number of empirical studies that have revealed the negative effects of such things as colonialism and stereotypes about race and gender on the production of reliable science (Fausto-Sterling 1995, Martin 1991\1996, Schiebinger 1989, Traweek 1988). Evelyn Fox Keller's call for 'dynamic objectivity' (Keller 1985, pp. 115–26) and Sandra Harding's demand for 'strong objectivity' (Harding 1992, p. 244) are exemplary of the aspirations of theoretical feminist empiricism. These projects seek to prescribe scientific methods capable of generating accounts of the world that would improve upon disembodied, masculinist portrayals of science because they would be alert to the practices of domination and oppression inherent in the creation, dissemination, and possession of knowledge. Feminist empiricism nonetheless remains problematic because of its reliance on the dichotomies of bias vs. objectivity, use vs. misuse, and science vs. pseudoscience. The feminist insight of the 'contestability of every layer of the onion of scientific and technological constructions' (Haraway 1991, p. 186) flies in the face of leaving these epistemological dichotomies intact.
3. Subjects, Objects, and Agency

Haraway's notion of situated knowledges problematizes both subject and object. Unlike standpoint theories which attribute epistemological privilege to subjugated knowers, and the sociology of knowledge which attributes epistemological privilege to those in the right structural position vis-à-vis a given mode of production, Haraway attributes privilege to partiality. This shift underscores that 'situated knowledge' is more dynamic and hybrid than other epistemologies that take the position of the knower seriously, and involves 'mobile positioning' (Haraway 1991, p. 192). In situated knowledges based on embodied vision, neither subjects who experience, nor nature which is known, can be treated as straightforward, pretheoretical entities, 'innocent and waiting outside the violations of language and culture' (Haraway 1991, p. 109). Haraway maintains that romanticizing, and thus homogenizing and objectifying, the perfect subjugated subject position is not the solution to the violence inherent in dominant epistemologies. As feminists from developing countries have also insisted, there is no innocent, perfectly subjugated feminist subject position conferring epistemological privilege; all positionings are open to critical re-examination (Mohanty 1984\1991). Subjectivity is instead performed in and through the materiality of knowledge and practice of many kinds (Butler 1990, pp. 1–34). Conversely, the extraordinary range of objects in the physical, natural, social, political, biological, and human sciences about which institutionalized knowledge is produced should not be considered to be passive and inert. Haraway says that situated knowledges require thinking of the world in terms of the
‘apparatus of bodily production.’ The world cannot be reduced to a mere resource if subject and object are deeply interconnected. Bodies as objects of knowledge in the world should be thought of as ‘material-semiotic generative nodes,’ whose ‘boundaries materialize in social interaction’ (Haraway 1991, p. 201). The move to grant agency to material objects places the epistemology of situated knowledges at the center of recent scholarship in science and technology studies (Callon 1986, Latour 1987).
4. Uptake and Critique
Donna Haraway’s essay ranks among the most highly cited essays in science and technology studies and has been anthologized. As stated above, situated knowledges is a provocative and rich methodological metaphor with resonances in many quarters. The dialogue between Harding and Haraway continued after the publication of Situated Knowledges (Harding 1992, pp. 119–63). Haraway’s epistemology has directly influenced, and has in turn been influenced by, the recent work of sociologists and anthropologists of science (Clarke and Montini 1993, Rapp 1999), feminist philosophers of science (Wylie 1999), and practicing feminist scientists (Barad 1996). In addition, ‘situated knowledges’ is used as a standard technical term of the field by more junior scholars. Critics of situated knowledges have been few. Timothy Lenoir has pointed out that many of the epistemological ideas behind Haraway’s situated knowledges are found not only in other major strands of science and technology studies, but also in the work of continental philosophers such as Nietzsche. He likewise critiqued the idea of situated knowledges for its dependence on the apparatus of semiotics (Lenoir 1999, pp. 290–301). Historian Londa Schiebinger, in her recent book summarizing the effects of a generation of feminist scholarship on the practice of science, places Haraway’s situated knowledges together with Harding’s strong objectivity as attempts to integrate social context into scientific analysis (Schiebinger 1999). Implicit critiques have been leveled against the limitations of the idea of being situated, for example in the development of De Laet and Mol’s mobile epistemology (De Laet and Mol 2000). Sheila Jasanoff and her colleagues have argued for bringing differently spatialized entities, such as the nation, the local, and the global, into the epistemology of science and technology studies, while retaining the insights gained by paying attention to practice, vision, and measurement. These critiques stand more as continuing conversations with, than rebuttals of, situated knowledges, however. Overall, the idea of situated knowledges remains central to feminist epistemology and science studies and to attempts to understand the role of modern science in society.
See also: Cognitive Psychology: Overview; Contextual Studies: Methodology; Feminist Epistemology; Feminist Political Theory and Political Science; Feminist Theory; Feminist Theory: Postmodern; Knowledge, Anthropology of; Knowledge, Sociology of; Rationality and Feminist Thought; Science and Technology, Anthropology of; Scientific Knowledge, Sociology of; Situation Model: Psychological
Bibliography
Akrich M, Latour B 1992 A summary of a convenient vocabulary for the semiotics of human and nonhuman assemblies. In: Bijker W, Law J (eds.) Shaping Technology\Building Society: Studies in Sociotechnical Change. MIT Press, Cambridge, MA, pp. 259–64
Barad K 1996 Meeting the universe halfway: realism and social constructivism without contradiction. In: Hankinson Nelson L, Nelson J (eds.) Feminism, Science, and the Philosophy of Science. Kluwer, Dordrecht, The Netherlands, pp. 161–94
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, London
Callon M 1986 Some elements of a sociology of translation: domestication of the scallops and fishermen of St. Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge. Routledge, London, pp. 196–233
Chodorow N 1978 The Reproduction of Mothering. University of California Press, Berkeley
Clarke A, Fujimura J (eds.) 1992 The Right Tools for the Job: At Work in Twentieth-Century Life Sciences. Princeton University Press, Princeton, NJ
Clarke A, Montini T 1993 The many faces of RU486: Tales of situated knowledges and technological contestations. Science, Technology and Human Values 18: 42–78
Collins P H 1989 The social construction of black feminist thought. Signs 14(4): 745–73
De Laet M, Mol A 2000 The Zimbabwe bush pump: Mechanics of a fluid technology. Social Studies of Science
Fausto-Sterling A 1995 Gender, race, and nation: The comparative anatomy of ‘Hottentot’ women in Europe, 1815–1817. In: Terry J, Urla J (eds.) Deviant Bodies: Critical Perspectives on Difference in Science and Popular Culture. Indiana University Press, Bloomington, IN, pp. 19–48
Gilligan C 1982 In a Different Voice: Psychological Theory and Women’s Development. Harvard University Press, Cambridge, MA
Haraway D 1991 Simians, Cyborgs and Women: The Reinvention of Nature. Routledge, New York
Harding S 1986 The Science Question in Feminism. Cornell University Press, Ithaca, NY
Harding S 1992 Whose Science? Whose Knowledge? Thinking from Women’s Lives. Cornell University Press, Ithaca, NY
Hartmann H 1981 The unhappy marriage of Marxism and feminism. In: Sargent L (ed.) Women and Revolution. South End Press, Boston
Hartsock N 1983 The feminist standpoint: Developing the ground for a specifically feminist historical materialism. In: Harding S, Hintikka M (eds.) Discovering Reality: Feminist Perspectives on Epistemology, Metaphysics, Methodology, and Philosophy of Science. Reidel, Dordrecht, The Netherlands, pp. 283–310
Hesse M 1980 Revolutions and Reconstructions in the Philosophy of Science. Indiana University Press, Bloomington, IN
Keller E F 1983 A Feeling for the Organism: The Life and Work of Barbara McClintock. W. H. Freeman and Company, New York
Keller E F 1985 Reflections on Gender and Science. Yale University Press, New Haven, CT
Kuhn T S 1962\1970 The Structure of Scientific Revolutions, 2nd edn. University of Chicago Press, Chicago
Latour B 1987 Science in Action: How to Follow Scientists and Engineers Through Society. Open University Press, Philadelphia, PA
Lenoir T 1999 Was the last turn the right turn? The semiotic turn and A. J. Greimas. In: Biagioli M (ed.) The Science Studies Reader. Routledge, London, pp. 290–301
Martin E 1991\1996 The egg and the sperm: how science has constructed a romance based on stereotypical male–female roles. In: Laslett B, Kohlstedt G S, Longino H, Hammonds E (eds.) Gender and Scientific Authority. University of Chicago Press, Chicago, pp. 323–39
Mohanty C T 1984\1991 Under Western eyes: Feminist scholarship and colonial discourses. In: Mohanty C, Russo A, Torres L (eds.) Third World Women and the Politics of Feminism. Indiana University Press, Bloomington, IN, pp. 51–80
Pickering A (ed.) 1992 Science as Practice and Culture. University of Chicago Press, Chicago
Rapp R 1999 Testing Women, Testing the Fetus: The Social Impact of Amniocentesis in America. Routledge, London
Rose H 1983 Hand, brain, and heart: A feminist epistemology for the natural sciences. Signs 9(1): 73–90
Schiebinger L 1989 The Mind Has No Sex? Women in the Origins of Modern Science. Harvard University Press, Cambridge, MA
Schiebinger L 1999 Has Feminism Changed Science? Harvard University Press, Cambridge, MA
Shapin S 1994 A Social History of Truth: Civility and Science in Seventeenth-century England. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Smith D 1990 The Conceptual Practices of Power: A Feminist Sociology of Knowledge. University of Toronto Press, Toronto, ON
Traweek S 1988 Beamtimes and Lifetimes: The World of High Energy Physicists. Harvard University Press, Cambridge, MA
Wylie A 1999 The engendering of archaeology: Refiguring feminist science studies. In: Biagioli M (ed.) The Science Studies Reader. Routledge, London, pp. 553–68
C. M. Thompson
Situated Learning: Out of School and in the Classroom
Situated learning is not a unitary, well-defined concept. From an educational point of view, the core idea behind the different uses of this term is to create a situational context for learning that strongly resembles
possible application situations in order to assure that the learning experiences foster ‘real-life’ problem solving.
1. Situated Cognition–Situated Learning
Since 1985, the notion of situatedness of learning and knowing has become prominent in a variety of scientific disciplines such as psychology, anthropology, and computer science. In this entry, however, we focus only on situatedness approaches in education and educational psychology. Since the late 1980s a particular educational problem has received much attention. In traditional forms of instruction, learners acquire knowledge that they can explicate, for example in examinations. When learners are confronted with complex ‘real-life’ problems, however, this knowledge frequently goes unused, although it is relevant. Such knowledge is termed ‘inert knowledge’ (Renkl et al. 1996). This frequently found inertness phenomenon motivated some researchers to postulate that the whole notion of knowledge that is used in cognitive psychology as well as in everyday reasoning about educational issues is wrong (e.g., Lave 1988). It has been argued that knowledge is not some ‘entity’ in a person’s head that can be acquired in one situational context (e.g., classroom) and then be used in another context (e.g., workplace), but that instead it is context-bound. Hence, symbolic modeling of cognitive processes and structures is regarded as inappropriate. From a situatedness perspective, knowledge is generally constituted by the relation or interaction between an agent and the situational context they are acting in. Hence, it is proposed that we use the term ‘knowing’ instead of ‘knowledge’ in order to underline the process aspect of knowing\knowledge (Greeno et al. 1993). As a consequence of this conception of knowledge, learning must also be conceived as context-bound or situated. On this view, it is thus understandable that there is no transfer between such different contexts as the classroom and everyday life. The theoretical assumption of the situatedness of knowing and learning also has consequences for research methodology. Laboratory experiments can no longer be seen as appropriate, because in this research strategy the phenomena are put out of their ‘natural’ context and their character is changed (e.g., Lave 1988). Situatedness proponents conduct primarily qualitative field studies. Unfortunately, the term situatedness is not well defined. One reason for the fuzziness of this construct is that its proponents differ in how far they depart from traditional cognitive concepts. Whereas Lave (1988), for example, radically rejects the notions of knowledge, transfer, and symbolic representations as not being sensible, other proponents, such as Greeno et al. (1993), hold a more modest position. Although
they also stress the situatedness of knowing and learning, they aim, for example, to analyze conditions for transfer, or claim that representations can play a role in human activity. A factor that further contributes to the vagueness of the notion of situated learning is that it is used in two ways: descriptively and prescriptively. When descriptively used, situated learning means that learning is analyzed as a context-bound process. In the field of education, situated learning is, however, mostly used as a prescriptive concept. It is argued that learning should be situated in a context that resembles the application contexts. The prescriptive aspect of situated learning was quite appealing to the community of educational researchers in the 1990s. Hence, not only those who subscribe to the situated cognition perspective (i.e., rejection of cognitive concepts), but also people who still more or less remain within the traditional cognitive framework rely on this concept. The latter group often juxtaposes situated learning with decontextualized (traditional) learning, in which concepts and principles are presented and acquired in an abstract way with little or no relation to ‘real-world’ problems. It is important to note that this ‘assimilated’ view, in principle, contradicts the more fundamental situatedness concept according to which there is no nonsituated learning. Even typical abstract school learning is situated in the very specific context of school culture, although this situatedness would usually be evaluated as unfavorable because of the differences between school and ‘real-life’ contexts. Hence, irrespective of the logical inconsistencies in the use of the notion of situated learning, the prescriptive notion of situated learning means that the learning and the application situations should be as similar as possible in order to assure that the learning experiences have positive effects on ‘real-life’ problem solving.
2. Learning and Problem Solving in School and Outside
From a situatedness perspective, typical school learning is not much help for problem solving in everyday or professional life because the contexts are too different. Based, among others, upon Resnick’s (1987) analyses of typical differences between these contexts, the following main points are outlined:
(a) Well-defined problems in school versus ill-defined problems outside. For example, word problems in mathematics are usually well defined. It is clear what the problem is all about, all the necessary information is given in the problem formulation, there is usually only one way to arrive at the solution that is labeled as appropriate, and so on. Nontrivial, ‘real-life’ problems (e.g., improvement of an organization’s communication structure) often first have to be defined more precisely for a productive solution: one has to decide whether one needs more information, one has to seek
information, one has to decide what is relevant and irrelevant information, there are multiple ways of solving a problem, and so on.
(b) Content structured by theoretical systems in school versus structured by problems outside. In traditional forms of instruction, content is structured according to theoretical systematizations (e.g., biological taxonomies). This helps learners to organize the content and to remember it. One very salient systematization of content is the distinction between different school subjects. When a ‘real-life’ problem (e.g., pollution in a city or the valuation of an Internet company’s shares) has to be solved, thinking within the boundaries of school subjects is often not helpful. Furthermore, the structure of concepts used in school is frequently not relevant to the problem at hand. In ‘real life,’ the nature of the problem to be solved determines what concepts and information are required and in which structure they are needed.
(c) Individual cognition in school versus shared cognition outside. In schools, the usual form of learning and performance is individualistic. Cooperation in examinations is even condemned. In professional or everyday life, in contrast, cooperation is valued and it is frequently necessary for solving problems.
(d) Pure mentation in school versus tool manipulation outside. In traditional instruction, pure ‘thought’ activities dominate. Students should learn to perform without the support of tools such as books, notes, calculators, etc. Especially in exams, tools are usually forbidden. In contrast, a very important skill in everyday or professional life is the competent use of tools.
(e) Symbol manipulation in school versus contextualized reasoning outside. Abstract manipulation of symbols is typical of traditional instruction. Students often fail to match symbols and symbolic processes to ‘real-world’ entities and processes. In ordinary life, on the other hand, not only are tools used, but reasoning processes are an integral part of activities that involve objects and other persons. ‘Real-world’ reasoning processes are typically situated in rich situational contexts.
(f) Generalized learning in school versus situation-specific competencies outside. One reason for the abstract character of traditional instruction is that it aims to teach general, widely usable skills and theoretical principles. Nobody can foresee what types of specific problems students will encounter in their later life. In everyday and professional life, in contrast, situation-specific skills must be acquired. Further learning is mostly aimed at competencies for specific demands (e.g., working with a new computer program).
On the one hand, this list is surely not complete and more differences could be outlined. On the other hand, this juxtaposition is somewhat oversimplifying. Nevertheless, there is a core of truth because typical learning in and out of school differs significantly. Given these
differences, a consequence of situatedness assumptions is to claim that learning environments in school should strongly resemble application contexts or that learning should take place in the field (e.g., ‘on the job’). A traditional model in which people acquire applicable skills in authentic contexts is apprenticeship learning.
3. Apprenticeship Learning as Situated Learning
The situatedness protagonists Lave and Wenger (1991) investigated out-of-school learning in the form of apprenticeship by means of qualitative analyses. They focused on people working in traditional crafts, for example Indian midwives in Mexico, tailors in Liberia, and butchers in supermarkets. In these traditional apprenticeships, learners acquire mainly manual skills. In modern society, in contrast, ‘cognitive’ domains prevail (e.g., computer science, psychology), so that skilled activity is hardly ‘visible.’ In school learning, too, cognitive skills such as mathematical problem solving, reading, and writing dominate. Against this background, Collins et al. (1989) developed the instructional cognitive apprenticeship model. This model stresses the importance of explicating or reifying cognitive processes (e.g., strategies, heuristics) during learning. Thus, cognitive processes can be made approximately as explicit as the more manual skills trained in traditional apprenticeship. The core of cognitive apprenticeship is a special instructional sequence and the employment of authentic learning tasks. Experts provide models in applying their knowledge in authentic situations. Thereby they externalize (verbalize) their reasoning. The learners then work on authentic tasks of increasing complexity and diversity. An expert or teacher is assigned an important role as a model and as a coach providing scaffolding. The learners are encouraged increasingly to take an active role, as the expert’s support is gradually faded out. Articulation is promoted so that normally internal processes are externalized and can be reflected upon. This means that one’s own strategies can be compared with those of experts, are then open to feedback, and can be discussed. In addition, the student’s own cognitive strategies can be compared with those of other students. In the course of interaction with experts and other learners, students can also get to know different perspectives on concepts and problems. As a result of this instructional sequence, the students increasingly work on their own (exploration) and may take over the role initially assumed by the expert. Lave and Wenger (1991) have characterized such a sequence as development from legitimate peripheral participation to full participation. It is important to note that the type of apprenticeship learning that is envisioned by the proponents of the situatedness approaches implies much more than the acquisition of ‘subject-matter knowledge.’ It
is a process of enculturation. The learner gradually develops the competence to participate in a community of practice. Such participation presupposes more than the type of knowledge usually focused on in classroom learning. In addition, ‘tricks of the trade’ and knowledge of social norms, for example, are required.
4. Problem-based Learning as Situated Learning Besides apprenticeship, problem-based learning is another possibility for implementing arrangements in accordance with a situatedness rationale. Learning should be motivated by a complex ‘real-world’ problem that is the starting point of a learning process. For example, Greeno and the Middle School Mathematics Through Applications Project Group (MMAP) (1998) designed learning arrangements in which mathematical reasoning is not triggered primarily in separate mathematics lessons, but within design activities in four domains: architecture, population biology, cryptography, and cartography. The design activities are supported by the employment of computer tools which are also typical of current situated learning arrangements. An instructional principle is to induce quantitative reasoning involving proportions, ratios, and rates during design activities that strongly resemble the activities of many everyday crafts and commercial practices. The mathematical reasoning within design activities can be quite sophisticated; however, it often remains implicit. The teachers’ task is to uncover the mathematics the students are implicitly using. For this purpose, there are, among others, curricular materials for ‘math extension’ units (i.e., explicit mathematics lessons).
5. Common Critiques of the Situatedness Approach
The situatedness camp has criticized the fundamental assumptions of cognitively oriented educational research. Hence it is not surprising that the situatedness camp has also been heavily attacked. Three major objections to the situatedness approach, and the corresponding defenses, are outlined here:
(a) Faulty fundamental assumptions: Anderson et al. (1996), in particular, have argued that the situatedness approach is based on wrong assumptions such as ‘knowledge does not transfer between tasks’ or ‘training in abstraction is of little use.’ Anderson et al. cited empirical studies that contradict these assumptions. In his reply, Greeno (1997) argues that the assumptions Anderson et al. criticize are indeed wrong, but that they are not in fact claims of the situativity approach. The core of a situativity theory is a different perspective on phenomena of learning. Instead of focusing on mental processes and structures, the
situativity approach analyses ‘… the social and ecological interaction as its basis and builds toward a more comprehensive theory by … analyses of information structures in the contents of people’s interactions’ (Greeno 1997, p. 5).
(b) Triviality: It is also argued that there is nothing really new in the core arguments of the situatedness theories. For example, Vera and Simon (1993) argue that all findings of the situatedness research can be incorporated into the well-elaborated traditional framework of cognitive models (i.e., the symbolic paradigm). From the situatedness perspective, however, analyzing symbolic structures and processes falls short; they may just be a special case of activity (Greeno and Moore 1993). Another triviality argument is that many claims of the situatedness protagonists were already articulated long before, for example by Dewey, Piaget, and Vygotsky, so that they are hardly stating anything new (e.g., Klauer 1999). Renkl (2000) counters that the situatedness protagonists deserve credit for having reactivated these classical ideas. Most of these ideas played only a very minor role in educational mainstream research before the situatedness approach emerged. Furthermore, situated learning approaches bring classical ideas together with new developments (e.g., learning with new technologies) to form new ‘Gestalts.’
(c) Weak methodology: Klauer (1999) is one of many researchers who object that the situatedness protagonists employ purely qualitative research methods and that they often rely merely on anecdotes to support their claims of the situatedness of cognition and learning (e.g., the ‘cottage cheese story’; cf. Lave 1988). Lave (1988), on the other hand, rejects the empirical-experimental paradigm as artificially decontextualizing phenomena in laboratory investigations. From a situatedness point of view, it is clear that results from laboratories are of questionable value for out-of-lab contexts. It is important to note that many researchers who merely assimilated the notion of situated learning, but do not subscribe to radical situatedness, keep on researching within the traditional empirical framework. Hence, the methodological critique aims at the more radical situatedness proponents.
6. Possible Futures of Situated Learning On the one hand, at least some situatedness protagonists in the area of education make strong statements with respect to the advantage of their approach over the traditional one. On the other hand, the ‘traditionalists’ defend themselves. Against this background it is interesting to ask what the future will bring. Four main possibilities are discussed. (a) Critical traditionalists (Klauer 1999) argue that the situatedness approach and the discussion around it will disappear as other more or less fruitless debates (e.g., the person–situation debate in psychology) have done
before. (b) Others (e.g., Cobb and Bowers 1999) hope that the situatedness perspective will take the place of the cognitive paradigm in education, just as years ago the behavioral paradigm was driven out by the cognitive one. (c) Some ‘observers’ of the situatedness debate (e.g., Sfard 1998) argue that the two positions provide different metaphors for analyzing learning and that both are useful. Accordingly, they plead for a complementary coexistence. (d) Greeno and MMAP (1998) envision a very ambitious goal for their situativity approach; that is, to develop a situatedness approach that is a synthesis of the cognitive and the behavioral paradigms. Whatever possibility may become reality, there is at least consensus between the main opponents in the situatedness debate (cf. Anderson et al. 1997, Greeno 1997) on what the touchstone for the two approaches should be: the ability to improve education.
See also: Capitalism: Global; Piaget’s Theory of Human Development and Education; School Learning for Transfer; Situated Cognition: Contemporary Developments; Situated Cognition: Origins
Bibliography
Anderson J R, Reder L M, Simon H A 1996 Situated learning and education. Educational Researcher 25: 5–11
Anderson J R, Reder L M, Simon H A 1997 Situated versus cognitive perspectives: Form versus substance. Educational Researcher 26: 18–21
Cobb P, Bowers J 1999 Cognitive and situated perspectives in theory and practice. Educational Researcher 28: 4–15
Collins A, Brown J S, Newman S E 1989 Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In: Resnick L B (ed.) Knowing, Learning, and Instruction: Essays in Honor of Robert Glaser. Erlbaum, Hillsdale, NJ, pp. 453–94
Greeno J 1997 On claims that answer the wrong questions. Educational Researcher 26: 5–17
Greeno J G, Middle School Mathematics Through Applications Project Group 1998 The situativity of knowing, learning, and research. American Psychologist 53: 5–26
Greeno J G, Moore J L 1993 Situativity and symbols: Response to Vera and Simon. Cognitive Science 17: 49–59
Greeno J G, Smith D R, Moore J L 1993 Transfer of situated learning. In: Detterman D K, Sternberg R J (eds.) Transfer on Trial: Intelligence, Cognition, and Instruction. Ablex, Norwood, NJ, pp. 99–167
Klauer K J 1999 Situated learning: Paradigmenwechsel oder alter Wein in neuen Schläuchen? Zeitschrift für Pädagogische Psychologie 13: 117–21
Lave J 1988 Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life. Cambridge University Press, Cambridge, UK
Lave J, Wenger E 1991 Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge, UK
Renkl A 2000 Weder Paradigmenwechsel noch alter Wein!—Eine Antwort auf Klauers ‘Situated Learning: Paradigmenwechsel oder alter Wein in neuen Schläuchen?’ Zeitschrift für Pädagogische Psychologie 14: 5–7
Renkl A, Mandl H, Gruber H 1996 Inert knowledge: Analyses and remedies. Educational Psychologist 31: 115–21
Resnick L B 1987 Learning in school and out. Educational Researcher 16: 13–20
Sfard A 1998 On two metaphors for learning and the dangers of choosing just one. Educational Researcher 27: 4–13
Vera A H, Simon H A 1993 Situated action: A symbolic interpretation. Cognitive Science 17: 7–48
A. Renkl
Situation Model: Psychological For, it being once furnished with simple ideas, it [the mind] can put them together in several compositions, and so make variety of complex ideas, without examining whether they exist so together in nature (John Locke 1690 An Essay Concerning Human Understanding).
When we read a story, we combine the ideas derived from understanding words, clauses, and sentences into mental representations of events, people, objects, and their relations. These representations are called situation models. Thus, situation models are not representations of the text itself; rather, they could be viewed as mental microworlds. Constructing these microworlds is the essential feature of understanding. At the most basic level, situation models are mental representations of events. Aspects of events that are encoded in situation models are: what their nature is, where, when, and how they occur, and who and what is involved in them. Single-event models are integrated with models of related events constructed based on the preceding text, such that situation models may evolve into complex integrated representations of large numbers of related events. This is what occurs when we comprehend extended discourse, such as news articles, novels, or historical documents.
1. Background Situation models were introduced to cognitive psychology by van Dijk and Kintsch (1983) and are based on earlier research in formal semantics and logic (e.g., Kripke 1963). The concept is primarily used in research on language and discourse comprehension. For a good understanding, it is important to distinguish situation models from similar concepts, such as mental models, scripts, and frames.
2. How Situation Models Differ
2.1 Mental Models
Originally proposed by Craik (1943) and elaborated and introduced to cognitive psychology by Johnson-Laird (1983), mental models (see Mental Models,
Psychology of) are mental representations of real, hypothetical, or imaginary situations. Situation models can be viewed as a special type of mental model. Situation models are mental models of specific events. They are bound in time and space, whereas mental models in general are not. For example, heart surgeons have mental models of our blood circulatory system, but they construct a situation model of the state of patient X’s coronary artery at time (t).
2.2 Scripts and Frames
Originally proposed in the artificial intelligence literature by Schank and Abelson (1977) and Minsky (1975), respectively, scripts and frames (see Schemas, Frames, and Scripts in Cognitive Psychology) are representations of stereotypical sequences of events, such as going to a restaurant, and spatial layouts, such as that of a living room or the interior of a church. Comprehenders use scripts and frames to construct situation models. For example, the restaurant script can be used to construct a mental representation of your friend’s visit to a local Italian restaurant last night. This is accomplished by filling in the slots of the script. Thus, scripts and frames can be considered types, whereas situation models are tokens. Furthermore, scripts and frames are semantic memory representations, whereas situation models are episodic memory representations.

3. Other Representations Constructed During Comprehension
It is often assumed that readers construct multilevel mental representations during text comprehension. Although there is no complete consensus as to what other types of mental representations, besides situation models, are constructed during text comprehension, empirical and intuitive evidence for the role of the following representations has been put forth. The surface structure is a mental representation of the actual wording of the text. Surface-structure representations are typically short-lived in memory, except when they have a certain pragmatic relevance (for example in the case of insults or jokes) or when the surface structure is constrained by prosodic features, such as rhyme and meter, as in some poetry. The textbase is a mental representation of the semantic meaning of what was explicitly stated in the text. The textbase usually decays rather rapidly. That is, within several days, comprehenders are unable to distinguish between what they read and what they inferred (see Memory for Meaning and Surface Memory). Some researchers view the textbase simply as that part of the situation model that was explicitly stated (rather than implied) and thus deny the textbase a special status. Analyses of naturalistic discourse suggest that comprehenders not only construct a model of the denoted situation, but also construct a model of the communicative context. For example, they make inferences about the attitudes of writers regarding the situation they describe. Van Dijk (1999) calls this type of representation a ‘context model’ and argues that no account of discourse comprehension is complete without the inclusion of a context model.

4. Why Situation Models are Needed to Explain Language Use
A task analysis of text comprehension shows the need for situation models. For example, when we comprehend the instructions that come with a household appliance, we form mental representations of the actions needed to operate or repair the device. It would be of little use to construct a mental representation of the wording of the instructions themselves. Similarly, when we read newspaper articles, our usual goal is to learn and be updated about some news event. For example, we want to know how the United States House of Representatives responded to the latest budget proposal by the President or why the latest peace negotiations in the Middle East came to a halt. In these cases, we construct mental representations of agents, events, goals, plans, and outcomes, rather than merely mental representations of clauses and words.

5. Components of Situation Models
Situation models are models of events. Events always occur at a certain time and place. In addition, events typically involve participants (agents and patients) and objects. Furthermore, events often entertain causal relations with other events or are part of a goal-plan structure. Thus, time, place, participants, objects, causes and effects, and goals and plans are components of situations, with time and place being obligatory. As linguists have observed, most of these components are routinely encoded in simple clauses. Verbs typically describe events (although some events can also be described by nouns, e.g., explosion), while nouns and pronouns denote participants and objects (although pronouns can also denote events) and prepositions primarily denote spatial relations (although spatial relations can also be denoted by, for instance, verbs, as in ‘The nightstand supported a lamp’ and prepositions can be used to indicate temporal relationships, as in ‘In an hour’). Temporal information can be expressed lexically in a variety of ways, but is also encoded grammatically in the form of verb tense and aspect in languages such as English, German, and French.
Causation and intentionality are denoted lexically by verbs (e.g., ‘caused,’ ‘decided to’) or adverbs (e.g., ‘therefore,’ ‘in order to’), but are often left to be inferred by the comprehender. For example, there is no explicitly stated causal connection between the following two events, ‘John dropped a banana peel on the floor. The waiter slipped,’ yet comprehenders can easily infer the connection. Participants and objects may have various kinds of properties, such as having blue eyes or a cylindrical shape, and temporary features, such as being sunburnt or overheated. Finally, participants and objects may be related in various ways (e.g., kinship, professional, ownership, part–whole, beginning–end, and so on).
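The componential analysis above can be made concrete with a small illustration. The following sketch (our own, purely hypothetical encoding; the class and field names are not part of situation-model theory) shows one way the obligatory and optional components of a single event representation might be laid out as a data structure:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EventModel:
    # One event node in a situation model; time and place are obligatory.
    nature: str                 # what the event is, e.g., 'throw'
    time: str                   # when it occurs, e.g., 't1, past'
    place: str                  # where it occurs
    participants: List[str] = field(default_factory=list)  # agents and patients
    objects: List[str] = field(default_factory=list)
    causes: List['EventModel'] = field(default_factory=list)  # causal antecedents
    goal: Optional[str] = None  # the plan the event serves, if any

# The clause 'John threw the bottle against the wall' might be encoded as:
throw = EventModel(nature='throw', time='t1, past', place='near the wall',
                   participants=['John'], objects=['bottle', 'wall'])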
6. A Simple Example A simple example illustrates these points. Consider the following clause: (1) John threw the bottle against the wall.
This clause describes an event (throw) that involves a male agent (John) and two objects (bottle, wall). The simple past tense indicates that the event occurred prior to the moment at which the sentence was uttered (e.g., as opposed to John will throw the bottle against the wall). John’s location is not explicitly stated, but we infer it is in relative proximity to the wall. We may draw inferences as to the causal antecedent and consequent of this event. A plausible antecedent is that John was angry. A plausible consequent is that the bottle will break. Comprehenders are more likely to infer causal antecedents than causal consequences. Note that there are many other inferences that could be drawn. For example, one might infer that John is a middle-aged man, that the bottle was a wine bottle, and that it was green and half-filled. One might also infer that the wall was a brick wall, and so on. Comprehenders typically make relatively few of these kinds of elaborative inferences. Rather, they are focused on the causal chain of events (Graesser et al. 1994).
7. Situation Models Establish Coherence A major role of situation models in discourse comprehension is to help establish coherence. Suppose sentence (1) were followed in the text by (2):
(2) It shattered into a thousand pieces.
How would this sentence be comprehended? If the two sentences are part of a discourse, rather than read in isolation, they would have to be integrated in some fashion. They are about some set of events. Comprehenders assume by default that a new sentence will describe the next event in the chronological sequence. Thus, they will assume that the event described in (2) directly follows that in (1). This implies temporal contiguity between (1) and (2). The use of the simple past tense is consistent with this assumption. Event 2 occurs prior to the moment of utterance, but after event 1. Given that no passage of time was mentioned, temporal contiguity is assumed. The pronoun it in sentence (2) is taken to refer to some entity already in the situation model. There are three potential referents, John, the bottle, and the wall. John is not appropriate, because animate male human beings require the pronoun he. However, the bottle and the wall are both inanimate and thus compatible with the pronoun. Because there is no linguistic way to select the proper referent, the comprehender will use background knowledge that bottles are typically made out of glass, which is breakable, and walls out of harder materials, such as brick or concrete, to infer that it was the bottle that broke. The absence of a time lapse and the presence of an object from the previous event will cause the comprehender to assume that the second event takes place in roughly the same spatial region as the first, given that the same object cannot be at two different places at the same time. Thus, the two events can now be integrated into a single situation model in which an agent, a male individual named John, at t1 threw a breakable object, a bottle, against a wall, presumably out of anger, which immediately, at t2, caused the bottle to break into pieces. This is how situation models provide coherence in discourse. Discourse is more than a sequence of sentences. Rather, it is a coherent description of a sequence of events that are related on several dimensions. Situation model theory attempts to account for how these events are successively integrated into a coherent mental representation.

8. Situation Models Integrate Textual Information with Background Knowledge
Another function of situation models is that they allow for the integration of text-derived information with the comprehender’s background knowledge. Consider the following examples (from Sanford and Garrod 1998):
(3) Harry put the wallpaper on the table. Then he put his mug of coffee on the paper.
It is rather straightforward to integrate these sentences. They call for a spatial arrangement in which the paper is on top of the table, the mug on top of the paper, and the coffee inside the mug. However, consider the following sentence pair, which differs from (3) by only one word:
(4) Harry put the wallpaper on the wall. Then he put his mug of coffee on the paper.
Many readers will have difficulty integrating these sentences, because this discourse snippet describes an impossible set of circumstances. Realizing this impossibility relies critically on the activation of background knowledge, in this case the knowledge that putting wallpaper on a wall produces a vertical surface, which would not support a mug of coffee. Thus, even though (or, rather, because) there are linguistic cues in the second sentence affording integration with the first—the pronoun can be taken to refer to Harry, and paper to wallpaper—this integration of text-derived information does not produce the correct understanding that the described situation is impossible. It is necessary to activate the requisite background knowledge to make this determination. By extension, the requisite background knowledge will help the comprehender construct an adequate representation in (3).
9. Other Functions of Situation Models
Situation models are needed in translation. Word-for-word translations yield nonsense in most cases. In order to arrive at a proper translation, one has to construct a situation model based on the source language and then convey this situation model in the target language. Situation models are also needed to explain learning from text. When we read a newspaper article about a current event, for example a war, we update our situation model of this event. We learn what the status of the actors and objects and their relations is at a new time. We would not learn by simply storing a mental representation of the text itself in long-term memory. In fact, what we know about current and historical political situations is usually an amalgam of information obtained from various sources (TV news, newspapers, magazines, encyclopedias, conversations with friends, and so on).

10. The Representational Format of Situation Models
Situation models often have an almost perceptual quality. In the example about the bottle, the first thing we construct is an agent, next is his action, next is the instrument of the action, and subsequently we see the consequence of the action. There is a debate regarding the perceptual nature of situation models. Traditionally, an amodal propositional format has been proposed for situation models (van Dijk and Kintsch 1983, Kintsch 1998). However, others have proposed perceptual symbols (Barsalou 1999, Glenberg 1997, Johnson-Laird 1983). In an amodal propositional representation, there is no analog correspondence between the mental representation and its referent. In a perceptual–symbol representation, there is. As a result, perceptual symbols represent more of the perceptual qualities of the referent than do amodal symbols, which are an arbitrary code. One of the major challenges of situation-model research will be to gather empirical evidence that speaks to the distinction between amodal and perceptual symbol systems as the representational format for situation models. Both amodal and perceptual symbol systems allow for the possibility that information constructed from a text and activated background knowledge from long-term memory have the same representational format, such that they can be integrated quite easily. This is an important quality, because the integration of text-derived information with background knowledge is an essential feature of comprehension.

11. Empirical Evidence
There is a wealth of empirical evidence supporting the notion of a situation model (see Zwaan and Radvansky 1998 for a review). Most of this evidence consists of reaction times collected while people are reading texts. In addition, there is an increasing amount of electrophysiological evidence (see, for example, Münte et al. 1998). Analyses of reading times, as well as electrophysiological measures, suggest that comprehenders have difficulty integrating a new event into the situation model when that event has different situational parameters from the previously described event, for instance when it occurs in a different time frame or involves a new participant. Evidence suggests that ease of integration of an event depends on its relatedness to the evolving situation model on each of the five situational dimensions. Probe-recognition studies show that the activation levels of concepts decrease after a change in situational parameters (e.g., a shift in time, space, participant, or goal structure). For example, people recognize the word ‘checked’ more quickly after having read ‘He checked his watch. A moment later, the doorbell rang’ than after ‘He checked his watch. An hour later, the doorbell rang.’ Thus, the ‘here and now’ of the narrated situation tends to be more accessible to comprehenders than other information. This, of course, mimics our everyday interaction with the world. Analyses of memory-retrieval data show that people tend to store events together in long-term memory based on the time and location at which they occur, the participants they involve, whether they are causally related, and whether or not they are part of the same goal-plan structure. There is evidence that
memory retrieval is influenced by a combination of the link strengths between events on the five situational dimensions. It has furthermore been shown that people’s long-term memory for text typically reflects the situation that was described; memory representations for the discourse itself are much less resistant to decay or interference.
12. Computational Models
Kintsch (1998) has developed a computational model of text comprehension, the construction-integration model (see also Text Comprehension: Models in Psychology). In this model, a network consisting of nodes representing the surface structure, the textbase, and the situation model is successively constructed and integrated in a sequence of cycles. The construction of the network is done by hand. The nodes are propositions (or surface-structure elements) and the links between them can be based on a variety of criteria, such as argument overlap, causal relatedness, or other aspects of situational relatedness. Integration occurs computationally by way of a constraint-satisfaction mechanism. Simulations using the construction-integration model have been most successful in (quantitatively) predicting text recall. In addition, the model has been shown to provide a qualitative fit with a variety of findings in text comprehension, such as anaphoric resolution and sentence recognition. Similar models have been proposed by other researchers. Like the construction-integration model, these models include aspects of situation-model construction. However, there currently exists no full-fledged computational model of situation-model construction.
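To give a rough sense of the integration phase just described, the following sketch (a minimal toy version of our own devising, not Kintsch’s actual implementation; the propositions and link weights are invented) repeatedly multiplies an activation vector by a hand-constructed connectivity matrix and normalizes the result until it stabilizes:

import numpy as np

# Hand-constructed network: nodes are propositions; links encode argument
# overlap or situational relatedness (weights invented for illustration).
W = np.array([
    [1.0, 0.6, 0.1],  # P1: THROW(JOHN, BOTTLE)
    [0.6, 1.0, 0.7],  # P2: SHATTER(BOTTLE)
    [0.1, 0.7, 1.0],  # P3: ANGRY(JOHN), an inferred elaboration
])
a = np.array([1.0, 1.0, 0.5])  # initial activations; inferences start lower

# Integration: spread activation through the links until the pattern settles.
for _ in range(100):
    new_a = W @ a
    new_a /= new_a.max()  # normalize so activations stay bounded
    if np.allclose(new_a, a, atol=1e-6):
        break
    a = new_a

print(dict(zip(['P1', 'P2', 'P3'], a.round(3))))

Nodes that are richly connected to the rest of the network retain high activation after integration, which is the sense in which a constraint-satisfaction mechanism settles on a coherent interpretation.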
13. Beyond Language
Situation models have significance beyond the domain of text comprehension. Researchers are beginning to apply this concept to other domains of cognitive psychology, such as the comprehension of visual media (e.g., movies) and autobiographical memory. In the first case, situation models are acquired vicariously, as in language comprehension, but are based, in part, on nonlinguistic visual and auditory information. In the second case, they are acquired via direct experience. The question of whether and how the mode of acquisition affects the nature of situation models is a fruitful one.
See also: Concept Learning and Representation: Models; Figurative Thought and Figurative Language, Cognitive Psychology of; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Language and Thought: The Modern Whorfian Hypothesis; Literary Texts: Comprehension and Memory; Mental Models, Psychology of; Mental
Representations, Psychology of; Narrative Comprehension, Psychology of; Reasoning with Mental Models; Sentence Comprehension, Psychology of
Bibliography
Barsalou L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences 22: 577–660
Craik K 1943 The Nature of Explanation. Cambridge University Press, Cambridge, UK
Glenberg A M 1997 What memory is for. Behavioral and Brain Sciences 20: 1–19
Graesser A C, Singer M, Trabasso T 1994 Constructing inferences during narrative text comprehension. Psychological Review 101: 371–95
Johnson-Laird P N 1983 Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Harvard University Press, Cambridge, MA
Kintsch W 1998 Comprehension: A Paradigm for Cognition. Cambridge University Press, Cambridge, MA
Kripke S 1963 Semantical considerations on modal logics. Acta Philosophica Fennica 16: 83–94
Minsky M 1975 A framework for representing knowledge. In: Winston P H (ed.) The Psychology of Computer Vision. McGraw-Hill, New York, pp. 211–77
Münte T F, Schiltz K, Kutas M 1998 When temporal terms belie conceptual order. Nature 395: 71–3
Sanford A J, Garrod S C 1998 The role of scenario mapping in text comprehension. Discourse Processes 26: 159–90
Schank R C, Abelson R 1977 Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures. Erlbaum, Hillsdale, NJ
van Dijk T A 1999 Context models in discourse processing. In: van Oostendorp H, Goldman S R (eds.) The Construction of Mental Representations During Reading. Erlbaum, Mahwah, NJ, pp. 123–48
van Dijk T A, Kintsch W 1983 Strategies in Discourse Comprehension. Academic, New York
Zwaan R A, Radvansky G A 1998 Situation models in language comprehension and memory. Psychological Bulletin 123: 162–85
R. A. Zwaan
Skinner, Burrhus Frederick (1904–90)
After J. B. Watson, the founder of the behaviorist movement, Skinner has been the most influential, and also the most controversial, figure of behaviorism. His contributions to behavioral sciences are manifold: he designed original laboratory techniques for the study of animal and human behavior, which he put to work to produce new empirical data in the field of learning, and to develop a theory of operant behavior that led him eventually to a general psychological theory; he further elaborated the behaviorist approach, both refining and extending it, in a version of behavioral science which he labeled radical behaviorism; he
formulated seminal proposals for applied fields such as education and therapy and pioneered machine-assisted learning; finally, building upon his conception of the causation of behavior, he ventured into social philosophy, questioning the traditional view of human nature and of the relation of humans to their physical and social environment. This part of his work has been the main source of sometimes violent controversy.
1. Biographical Landmarks
Skinner was born on March 20, 1904 in Susquehanna (Pennsylvania, USA) in a middle-class family and experienced the usual childhood and adolescence of provincial American life. He attended Hamilton College, which was not a particularly stimulating institution for him. He was first attracted to a literary career, which he soon gave up after traveling to Europe. He turned to psychology, and was admitted to Harvard in 1928. He obtained his Ph.D. in 1931, with a theoretical thesis on the concept of reflex—a first landmark in his reflections on the causation of behavior, an issue he was to pursue throughout his scientific career. He stayed at Harvard five more years, beneficiary of an enviable fellowship, affiliated with the physiology laboratory headed by Crozier. In 1936, he was appointed professor at the University of Minnesota where he developed his conditioning chamber for the study of operant behavior in animals—which was to be known as the Skinner box—and wrote his first book, The Behavior of Organisms (1938). In 1945, he moved to Indiana University, as Chairman of the Department of Psychology. In 1948, he was offered the prestigious Edgar Pierce Professorship in Psychology at Harvard University, where he was to stay until his death on August 18, 1990. Skinner received during his lifetime the highest national awards an American psychologist could receive, and he was praised as one of the most prominent psychologists of the century, in spite of harsh attacks against some of his ideas from opposite quarters of scientific and lay circles. Although most of Skinner’s laboratory research was carried out with animal subjects, mainly rats and pigeons, using the so-called operant procedure that will be described hereafter, essentially he was interested, not in animal behavior proper, but in behavior at large, and more specifically in human behavior. Following the tradition of other experimental psychologists before him, such as Pavlov and Thorndike, and of most of his colleagues in the behaviorist school of thought, such as Hull, Tolman, or Guthrie, he resorted to animals as more accessible subjects than humans for basic studies on behavior, just as physiologists used to do quite successfully. The relevance of extrapolating from animals to humans is of course an important issue in psychology. However, Skinner’s main concern was obviously with humans, as evidenced by his literary writings in the field of social philosophy—namely the utopian novel Walden Two (1948) and the essay Beyond Freedom and Dignity (1971)—as well as by his theoretical endeavors to account for human behavior (Science and Human Behavior 1953) and for verbal behavior (Verbal Behavior 1957).

2. Operant Conditioning and the Skinner Box
The operant conditioning chamber, often called the Skinner box, is a laboratory device derived from Thorndike’s puzzle box and from the mazes familiar to students of learning in rats by the time Skinner started his career. In its most common form, it consists of a closed space in which the animal moves freely; it is equipped with some object that the subject can manipulate easily—be it a lever for rats, or a small illuminated disk upon which pigeons can peck—and with a food dispenser for delivering calibrated quantities of food. By spontaneously exploring this particular environment, if need be with the experimenter’s help in progressively shaping its behavior, the subject will eventually discover the basic relation between a defined response—pressing the lever or pecking the key—and the presentation of a reinforcing stimulus—a small food reward. The basic relation here is between an operant response (i.e., a response instrumental in producing some subsequent event) and its consequence (i.e., the reinforcement), rather than between a stimulus and a response elicited by it, as in Pavlovian or respondent conditioning. This simple situation may be made more complex either by introducing so-called discriminative stimuli, the function of which is not to trigger the response in the manner of a reflex, but to set additional conditions under which the response will be reinforced, or by changing the basic one-response–one-reinforcement link to some more complicated contingencies, for instance requiring a given number of responses for one reinforcement, or the passing of some defined delay (a schematic simulation of two such contingencies is sketched at the end of this section). A wide variety of schedules of reinforcement have thus been studied, be it for their own sake as sources of information on the lawfulness of behavior (for instance, modern research has applied optimization models borrowed from economics to the study of operant behavior), or as efficient tools for other purposes (such as the analysis of sensory functions in animal psychophysics, of the effects of drugs acting upon the central nervous system in experimental psychopharmacology, or of cognitive capacities). The operant technique presented two important features by the time Skinner developed it, from the 1930s to the 1950s. It emphasized the study of individual subjects through a long period of time rather than groups of subjects for a few sessions, as
used to be the case in maze studies and the like. This interest in individual behavior would favor later applications to human subjects in educational and clinical settings. Second, the operations involved were soon automated by resorting to electromechanical circuits, later replaced by online computer control. This led to a level of efficiency and precision unprecedented in psychological research.
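As an illustration of the contingencies mentioned above, the following sketch (our own illustration; Skinner’s actual apparatus was electromechanical, and the class names here are hypothetical) shows how a fixed-ratio and a fixed-interval schedule decide whether a given response is reinforced:

import time

class FixedRatio:
    # FR-n schedule: reinforce every nth response.
    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True  # deliver the reinforcer
        return False

class FixedInterval:
    # FI-t schedule: reinforce the first response after t seconds have elapsed.
    def __init__(self, t):
        self.t, self.last = t, time.monotonic()

    def respond(self):
        if time.monotonic() - self.last >= self.t:
            self.last = time.monotonic()
            return True
        return False

# Example: on an FR-5 schedule, only every fifth lever press produces food.
fr5 = FixedRatio(5)
print([fr5.respond() for _ in range(12)])  # True on presses 5 and 10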
3. The Evolutionary Analogy
Skinner captured the essence of operant behavior in the formula 'control of behavior by its consequences,' and very early he pointed to the analogy between the selection of the response by the subsequent event and the mechanism at work in biological evolution. An increasingly large part of his theoretical contributions was eventually devoted to elaborating the evolutionary analogy (Skinner 1987). The generalization of the selectionist model to behavior acquisition at the individual level, initially little more than a metaphorical figure, has recently gained credibility with the theses of neurobiologists such as Changeux ('generalized Darwinism,' 1983) and Edelman ('neural Darwinism,' 1987), both of whom have substantiated in ontogeny selective processes previously reserved for phylogeny. One of the main tenets of Skinner's theory thus converges with contemporary views in the neurosciences. Skinner extended the selectionist explanation to cultural practices and achievements, joining some schools of thought in cultural anthropology and in the history of science, such as Karl Popper's selectionist account of scientific hypotheses.
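The logic of the analogy can be made concrete in a toy simulation: behavioral variants are emitted with probabilities proportional to their current strengths, and the variant that happens to be reinforced is differentially strengthened, much as fitter variants are differentially reproduced. The variant names, the reinforced variant, and the strengthening factor below are illustrative assumptions, not values from the experimental literature.

```python
import random

# Toy "selection by consequences": variants of behavior are emitted with
# probabilities proportional to their strengths; only the reinforced
# variant is strengthened by its consequence.
strengths = {"press_lever": 1.0, "groom": 1.0, "rear": 1.0}
REINFORCED = "press_lever"                 # hypothetical: only this yields food

for _ in range(500):
    total = sum(strengths.values())
    draw, cumulative = random.uniform(0, total), 0.0
    for variant, s in strengths.items():   # emit one variant, weighted by
        cumulative += s                    # its current strength
        if draw <= cumulative:
            break
    if variant == REINFORCED:              # its consequence strengthens it,
        strengths[variant] *= 1.05         # as selection favors fit variants

print(max(strengths, key=strengths.get))   # 'press_lever' comes to dominate
```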
4. Radical Behaviorism
As a behaviorist, Skinner viewed psychology as a branch of the natural sciences—more explicitly, as a branch of biology—which can deal with its subject matter using the same principles as other fields of the life sciences, albeit with the specific implementations required by its particular level of analysis. Skinner defended a brand of behaviorism quite distinct from the dominant view that prevailed in the second quarter of the century: his radical behaviorism was opposed to methodological behaviorism. For most psychologists, defining their science as the science of behavior, following Watson's recommendation, did not really mean that they had abandoned mental life as the main objective of their inquiry; rather, they had simply resigned themselves to studying behavior, because they had to admit that they had no direct access to mental life. Such methodological behaviorism, in fact, remained basically dualistic. In contrast, radical behaviorism is definitely monist, and it rejects any distinction between what is called mental and what is called behavioral. Skinner is, in this respect, closer to Watson's view than
to the position of other influential neobehaviorists of his generation, although he developed a far more sophisticated view of behavior than Watson's. For instance, he rejected the simplistic claim that thought is nothing more than subvocal language, and admitted that not all behavior is directly observable. Part of human behavior is obviously private, or covert, and raises difficult methodological problems of accessibility; but this is no reason to give it a different status in a scientific analysis. Skinner vigorously denounced mentalism, not so much because it refers to events that would occur in another space and be of a different substance than behavior, but because it offers pseudo-explanations which give the illusion of understanding what in fact remains to be accounted for. Mentalistic explanations, very common in everyday psychology, were still quite frequent in scientific psychology. For example, in the field of motivation, all sorts of behavior were assigned internal needs as causal agents: not only do we eat because we are hungry—a simple statement which, at any rate from a scientist's point of view, requires qualification—but we exhibit aggressive behavior because of some aggression drive, we interact with social partners because of a need for affiliation, we work toward successful and creative outcomes because of a need for achievement, and so on. For Skinner, such explanations excuse us from looking for the variables really responsible for the behavior; these variables are more often than not to be found in the environment, and can be traced to the history of individuals interacting with their social and physical environment. Combining this epistemological conception with the results of his empirical research, Skinner developed a theory of human behavior emphasizing the determining role of environmental contingencies on human actions. However, along the lines of the evolutionary analogy, he suggested a model that would account equally well for novelty and creative behavior, as exhibited in artistic and scientific production, and for stabilized, persistent habits adapted to unchanging conditions. Skinner devoted special attention to verbal behavior, the importance of which in the human species justified special treatment. He presented his book Verbal Behavior as 'an essay in interpretation,' proposing a functional analysis of the verbal exchanges composing the global episode of speaker–listener communication. In spite of the vehement criticisms expressed by the linguist Chomsky (1959), Skinner's analysis foreshadowed some aspects of the pragmatic approach adopted some years later by many psycholinguists, aware of the insufficiency of formal grammars to account for central features characterizing the use of language. He viewed verbal behavior as shaped by the linguistic community, and as exerting a genuine type of control over an individual's behavior, distinct from the action of the physical environment. A large part of human behavior is, in his terms, rule governed,
that is, controlled by words, rather than shaped by contingencies, that is, through direct exposure to the physical world. Many behaviors can have one or the other origin: avoidance of flames can derive from direct experience with fire, or from warnings received during education. Once endowed with verbal behavior, individuals can use it to describe and anticipate their own behavior, or develop it in its own right, as in literary composition. Were it not for the unfortunate use of the term rule, which invites confusion with the word as used in formal linguistics and with the notion of coercive control, what Skinner was pointing to was akin to a distinction now familiar in contemporary psychology and neuroscience, between top-down and bottom-up causation.
5. Education
Skinner's interests in applications covered three main areas: education, psychological treatment, and social practices at large. He dealt with the first two in a technical manner, developing principles and techniques for improving educational and therapeutic practices. His treatment of the third is more akin to social philosophy than to scientific application, although he viewed it as consistently rooted in his scientific thinking. At a time when school education in the US was being criticized for its deficiencies and for its inefficiency in competing with the technological achievements of the Soviet Union, Skinner, like many other American scientists, inquired into the reasons for that state of affairs. Observing what was going on in any ordinary classroom, including in reputable schools, he concluded that classroom practice blatantly violated the most basic principles of learning derived from laboratory analysis of the learning process. Pupils and students were passively exposed to teachers' monologues rather than actively producing behaviors followed by feedback; there was no attempt to adjust the teacher's actions to individual levels and rhythms of learning; negative evaluation based on mistakes and errors prevailed over positive evaluation pointing to progress achieved; punitive controls, known to be poorly effective in shaping and maintaining complex behavior, were still widely used; and general conditions and teaching practices were far from favorable to developing individual talents and creativity. Such criticisms had been made by others, but Skinner differed from them in his analysis of the causes and in the remedies he proposed. He did not question the importance of endowing students with the basic knowledge and skills they need if they are to engage in more complex and original activities. But he thought such skills could be mastered using more efficient methods than those current in the classroom. This was the origin of teaching machines, a term that would raise strong objections on the grounds that machines could lead
only to dehumanizing teaching. In fact, Skinner's idea has since been implemented in computer-assisted learning, which is now widely accepted—with little reference to the pioneering projects of the behaviorist psychologist. The device he designed in the 1950s appears quite primitive compared with modern computers: it was an electromechanical machine—adapted from a record player—built in such a way that it would present to the student, in a window, successive small frames of the material to be learned, each frame requiring an active answer from the learner. The learner could proceed at his or her own individual pace, ideally with few or no errors, and eventually reach the end of the program with the guarantee that the subject matter had been mastered from end to end. Good programs would make exams useless, if exams are just a way of checking that the material has been covered and understood. Most important, students and teachers would not waste the few hours they could work together on tasks easily fulfilled using teaching devices; they could devote the time so spared to more constructive activities requiring direct human contact. In spite of numerous attacks in educational circles, Skinner's project inspired many applications, such as programmed instruction in book form, until modern computers offered the elegant solutions we know today. It also contributed to the development of individualized teaching approaches that favor methods allowing students to learn at their own pace in an autonomous and active way. Teaching machines are but one aspect of Skinner's contribution to education. In a number of his writings, including his utopian novel, Skinner (1968, 1978) expressed his reflections on educational issues. He was concerned especially with the disproportion between the resources devoted to education and its poor outcomes; with the tendency to level down individual differences; with the increasing distance between the school environment and real life; with violence in schools; and with other matters which remain crucial issues today, with little improvement.
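The core logic of the teaching machine (one frame at a time, an active answer required, immediate feedback, advancement contingent on mastery) can be sketched as a short program loop. The frames and the exact answer-matching rule below are hypothetical illustrations, not material from Skinner's device.

```python
FRAMES = [  # hypothetical frames: (prompt, expected answer)
    ("A response followed by food is said to be rein____.", "forced"),
    ("In operant conditioning, behavior is controlled by its ____.",
     "consequences"),
]

def run_program(frames, ask=input):
    """Present frames one at a time; an active answer is required,
    feedback is immediate, and the learner advances only on success."""
    for number, (prompt, answer) in enumerate(frames, start=1):
        while True:                                  # self-paced: retry until
            reply = ask(f"Frame {number}: {prompt} ")  # the frame is mastered
            if reply.strip().lower() == answer.lower():
                print("Correct.")                    # immediate confirmation
                break

if __name__ == "__main__":
    run_program(FRAMES)
```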
6. Behavior Therapy
The treatment of psychological disturbances was another field of application to which Skinner (1955, 1989) made influential contributions. By the middle of the twentieth century, psychopathology had elaborated refined descriptive systems of psychological disturbances and equally sophisticated explanatory models, such as psychoanalytic theory. Contrasting with these, methods of treatment were scarce and their results poor. Psychoanalytic treatment had narrow indications and practical limitations. Rogers's nondirective therapy, though quite popular, did not provide convincing results. Psychopharmacology was still in its infancy.
Skinner did not question the classical categorization of mental illnesses, nor did he propose miracle remedies. He simply suggested looking at mental illnesses as disturbances of behavior, rather than as alterations of hypothetical mental structures, such as the psychic apparatus appealed to by psychoanalysis, of which abnormal behavior would be but the observable indicators, or symptoms. Consequently, he proposed attempting to change undesirable behavior by acting directly upon it, rather than upon underlying structures supposedly responsible for it. This approach was not totally new: behavior therapy had its origins in John Broadus Watson's attempts to treat fear in children by resorting to Pavlovian conditioning, and in the theoretical work of some neobehaviorists aimed at transposing psychoanalytical concepts into learning-theory models. What Skinner added was his own theoretical elaboration, based especially on his antimentalist stand, and techniques of behavior modification drawn from the operant laboratory, supplementing the Pavlovian techniques in use until then. He also brought to the clinical field a sense of rigor transferred from the laboratory, perfectly compatible with the study of single cases. Skinner did not practice behavior therapy himself. His influence in the field was indirect, through the pioneering research he stimulated on psychotic patients and mentally deficient people. He gave a decisive impulse to the development of the behavioral approach to treatment, which soon acquired a major position in clinical psychology and counseling, and which eventually merged, somewhat paradoxically, with cognitively oriented practices into what are labeled behavioral-cognitive therapies.
7. Social Philosophy
Skinner's social philosophy was based on a deep confidence that only science can help us solve the problems we face in modern societies. What is needed is a science of behavior, which, he thought, is in fact now available, and could be applied if only we were ready to abandon traditional views of human nature. He first expressed his ideas in the novel Walden Two (1948). Written two years after the end of World War Two, the book describes a utopian community run on principles derived from the psychological laboratory. It is by no means a totalitarian society, as some critics have claimed. Looked at retrospectively, it is surprisingly premonitory of social issues that remain largely unsolved half a century later. For instance, working schedules at Walden Two have been arranged so that all the tasks needed for the production of goods and the good functioning of the community are distributed among members according to a credit system resulting in an average of 24 hours of work per week, avoiding unemployment, abolishing any social discrimination between
manual and intellectual work, and leaving many free hours for leisure activities such as sports, the arts, and scientific research. Emphasis is put on active practice rather than passive watching, on cooperation rather than competition. The community is not culturally isolated: cultural products from outside, such as books or records, are of course welcome, but radio programs are filtered to eliminate advertising. Education is active; the school building, symbolically, has no door separating it from the life and work of the community; there are no age classes and no humiliating ranking; all learn at their own rhythm, in whatever direction they feel appropriate, throughout their lifetime. Women enjoy complete equality with men. Waste of natural resources is avoided. Offices in the management of the community are strictly limited in time, eliminating any risk of a political career. Similar themes, along with the frightening concerns of pollution, violence, uncontrolled population growth, nuclear weapons, and the like, are further elaborated in the essay Beyond Freedom and Dignity (1971) and in a number of later articles. In an alarming tone, Skinner points to what he feels is the core of our inability to deal with these issues, that is, our obstinacy in keeping a conception of human nature which scientific inquiry shows to be wrong, and which bars any solution to the problems we are confronted with. We still stick to a view of humans as the center of the universe, free and autonomous, dominating nature, while we are but one among many elements of nature. As a species, we are the product of biological evolution; as cultural groups, the result of our history; and as individuals, the outcome of our interactions with the environment. Because we fail to admit this dependency, and to draw the consequences from it, we may put our own future in danger. Freedom, autonomy, and merit are not absolute values: they were forged throughout history, and more often than not they are used to disguise insidious controls, the mechanisms of which should be elucidated if we want to develop the counter-controls that may eventually allow for the survival of our species. Skinner viewed these various facets of his work as closely related, forming a highly consistent theory of human behavior, in which the critical analysis of social processes in modern society was deeply rooted in the experimental analysis of the behavior of animal subjects in the laboratory. So global an ambition has been criticized, and clearly the various aspects of his contribution have not shared the same fate. While the operant technique is now a widely used procedure serving many purposes in experimental research in psychology and related fields, while his early attempts to build teaching machines now appear as ancestors of computer-assisted learning and teaching, and while a number of principles of contingency analysis are now routinely put into practice in behavior therapies, radical behaviorism has been seriously questioned and even shaken by the rise of the cognitivist approach in psychology, while
Skinner's social philosophy has been attacked from different fronts, on both ideological and scientific grounds. Like most great theory builders of the twentieth century in psychology, from Sigmund Freud and Watson to Jean Piaget, Skinner may be blamed for having reduced the explanation of human nature to a very limited set of concepts and findings, namely those he had forged and observed in his own restricted field of research and reflection, ignoring other concepts and facts in neighboring fields of psychology, let alone those of other sciences. It is clear that Skinner made no attempt to integrate, for example, the contributions of developmental or of social psychology, nor those of sociology, cultural anthropology, or linguistics. Such neglect may have been deliberate, legitimated by the will to concentrate on what were, in Skinner's mind, essential points left out by other branches of psychology or by other sciences dealing with human societies. However, it may appear as sectarianism to those who favor an integrative and pluridisciplinary approach to the complex objects of the human sciences. It cannot be decided whether his influence would have been larger or smaller had he adopted a less exclusive stand.
See also: Autonomic Classical and Operant Conditioning; Behavior Therapy: Psychiatric Aspects; Behavior Therapy: Psychological Perspectives; Behaviorism; Behaviorism, History of; Conditioning and Habit Formation, Psychology of; Darwinism: Social; Educational Learning Theory; Educational Philosophy: Historical Perspectives; Evolutionary Epistemology; Freud, Sigmund (1856–1939); Lashley, Karl Spencer (1890–1958); Mental Health and Normality; Operant Conditioning and Clinical Psychology; Pavlov, Ivan Petrovich (1849–1936); Piaget, Jean (1896–1980); Psychological Treatment, Effectiveness of; Psychological Treatments, Empirically Supported; Psychology: Historical and Cultural Perspectives; Thorndike, Edward Lee (1874–1949); Utopias: Social; Watson, John Broadus (1878–1958)
Bibliography
Bjork D W 1997 B. F. Skinner, A Life. American Psychological Association, Washington, DC
Changeux J-P 1983 L'Homme neuronal. Fayard, Paris (The Neuronal Man)
Chomsky N 1959 Review of Skinner B F, Verbal Behavior. Language 35: 26–58
Edelman G M 1987 Neural Darwinism: The Theory of Neuronal Group Selection. Basic Books, New York
Modgil S, Modgil C (eds.) 1987 B. F. Skinner: Consensus and Controversy. Falmer, New York
Richelle M 1993 B. F. Skinner, A Reappraisal. Erlbaum, Hove, London
Roales-Nieto J G, Luciano Soriano M C, Pérez Alvarez M (eds.) 1992 Vigencia de la Obra de Skinner. Universidad Granada Press, Granada (Robustness of Skinner's Work)
Skinner B F 1938 The Behavior of Organisms. Appleton-Century-Crofts, New York
Skinner B F 1948 Walden Two. Macmillan, New York
Skinner B F 1953 Science and Human Behavior. Macmillan, New York
Skinner B F 1955 What is psychotic behavior? In: Gildea F (ed.) Theory and Treatment of the Psychoses: Some Newer Aspects. Washington University Studies, St Louis, MO, pp. 77–99
Skinner B F 1957 Verbal Behavior. Appleton-Century-Crofts, New York
Skinner B F 1961 Cumulative Record. Appleton-Century-Crofts, New York
Skinner B F 1968 The Technology of Teaching. Appleton-Century-Crofts, New York
Skinner B F 1971 Beyond Freedom and Dignity, 1st edn. Knopf, New York
Skinner B F 1978 Reflections on Behaviorism and Society. Prentice-Hall, Englewood Cliffs, NJ
Skinner B F 1987 Upon Further Reflection. Prentice-Hall, Englewood Cliffs, NJ
Skinner B F 1989 Recent Issues in the Analysis of Behavior. Merrill, Columbus, OH
M. N. Richelle
Slavery as Social Institution
Slavery is the most extreme form of the relations of domination. It has existed, at some time, in most parts of the world and at all levels of social development. This article examines five aspects of the institution: Its distinguishing features; the means by which persons were enslaved; the means by which owners acquired them; the treatment and condition of slaves; and manumission, or the release from slavery.
1. The Distinctive Features of Slavery
The traditional, and still conventional, approach is to define slavery in legal–economic terms, typically as 'the status or condition of a person over whom any or all the powers attaching to the right of ownership are exercised' (League of Nations 1938, Vol. 6). In this view, the slave is, quintessentially, a human chattel. This definition is problematic because it adequately describes mainly Western and modern, capitalistic systems of slavery. In many non-Western parts of the world, several categories of persons who were clearly not slaves, such as unmarried women, concubines, debt bondsmen, indentured servants, sometimes serfs, and occasionally children, were bought and sold. Conversely, in many slave-holding societies certain categories of slaves, such as those born in the household, were not treated as chattels.
Slavery is a relation of domination that is distinctive in three respects. First, the power of the master was usually total, if not in law, almost always in practice. Violence was the basis of this power. Even where laws forbade the gratuitous killing of slaves, it was rare for masters to be prosecuted for murdering them, owing to the universally recognized right of masters to punish their slaves, and to the severe constraints placed on slaves in giving evidence in courts of law against their masters, or against free persons generally. The totality of the master's claims and powers in them meant that slaves could have no claims or powers in other persons or things, except with the master's permission. A major consequence of this was that slaves had no custodial claims in their children; they were genealogical isolates, lacking all recognized rights of ancestry and descent. From this flowed the hereditary nature of their condition. Another distinctive consequence of the master's total power is the fact that slaves were often treated as their masters' surrogates, and hence could perform functions for them as if they were legally present, a valuable trait in premodern societies with advanced commodity production and long-distance trading, such as ancient Rome, where laws of agency, though badly needed, were nonexistent or poorly developed. Second, slaves were universally considered outsiders, this being the major difference between them and serfs. They were natally alienated persons, deracinated in the act of their, or their ancestors', enslavement, who were held not to belong to the societies in which they lived, even if they were born there. They lacked all legal or recognized status as independent members of a community. In kin-based societies, this was expressed in their definition as kinless persons; in more advanced, state-based societies, they lacked all claims and rights of citizenship. Because they belonged only to their master, they could not belong to the community; because they were bonded only to their master's household, they could share no recognized bond of loyalty and love with the community at large. The most ancient words for slaves in the Indo-European and several other families of languages translate to mean 'those who do not belong,' or 'not among the beloved,' in contrast with free members of the community, who were 'among the beloved' and 'those who belonged.' Third, slaves were everywhere considered to be dishonored persons. They had no honor that a nonslave person need respect. Masters could violate all aspects of their slaves' lives with impunity, including raping them. In most slave-holding societies, injuries against slaves by third parties were prosecuted, if at all, as injuries against the person and honor of the master. Where an honor-price or wergild existed, as in Anglo-Saxon Britain and other Germanic lands, its payment usually went to the master rather than to the injured slave. Universally, slavery was considered the most extreme form of degradation, so
much so that the slave's very humanity was often in question. For all these reasons, there was a general tendency to conceive of slaves symbolically as socially dead persons. Their social death was often represented in ritual signs and acts of debasement, death, and mourning: In clothing, hairstyles, naming practices, and other rituals of obeisance and nonbeing.
2. The Modes of Enslavement
Free persons became slaves in one of eight ways: capture in warfare; kidnapping; through tribute and taxation; indebtedness; punishment for crimes; abandonment and sale of children; self-enslavement; and birth. Capture in warfare is generally considered to have been the most important means of acquiring slaves, but this was true mainly of simpler, small-scale societies, and of certain volatile periods among politically centralized groups. Among even moderately advanced premodern societies, the logistics of warfare often made captivity a cumbersome and costly means of enslaving free persons. Kidnapping differed from captivity in warfare mainly in that enslavement was its main or sole objective, and in that it was usually a private act rather than the by-product of communal conflict. Other than birth, kidnapping in the forms of piracy and abduction was perhaps the main form of enslavement in the ancient Near East and the Mediterranean during Greek and Roman times; and this was true also of the free persons who were enslaved in the trans-Saharan and transatlantic slave trades. Debt bondage, which was common in ancient Greece up to the end of the seventh century BC, in the ancient Near East, and in Southeast Asia down to the twentieth century, could sometimes descend into real slavery, although nearly all societies in which it was practiced distinguished between the two institutions in at least three respects: Debt-bondage was nonhereditary; bondsmen remained members of their community, however diminished; and they maintained certain basic rights, both in relation to the bondholder and to their spouses and children. Punishment for crimes was a major means of enslavement in small, kin-based societies; China, and to a lesser extent Korea, were the only advanced societies in which it remained the primary way of becoming a slave. Nonetheless, it persisted as a minor means of enslavement in all slaveholding societies, and became important historically in Europe as the antecedent of imprisonment as punishment for crimes. The enslavement of foundlings was common in all politically centralized premodern societies, though rarely found in small-scale slaveholding communities. It was the humane alternative to infanticide and was especially important in China, India, European antiquity, and medieval Europe. It has been argued that
it ranked second to birth as a source of slaves in ancient Rome from as early as the first century CE until the end of the Western empire. Self-enslavement was rare and was often the consequence of extreme penury or catastrophic loss. In nearly all slave-holding societies where the institution was of any significance, birth rapidly became the most important means by which persons became slaves, and by which slaves were acquired. Contrary to a common misconception, this was true even of slave societies in which the slave population did not reproduce itself naturally. The fact that births failed to compensate for deaths, or to meet the increased demand for slaves—which was true of most of the slave societies of the New World up to the last decades of the eighteenth century—does not mean that birth did not remain the main source of slaves. However, the role of birth as a source of slaves was strongly mediated by the rules of status inheritance, which took account of the complications caused by mixed unions between slaves and free persons. There were four main rules. (a) The child's status was determined by the mother's status only, regardless of the father's status. This was true of most modern and premodern Western slave societies, and of nearly all premodern non-Western groups with matrilineal rules of descent. (b) Status was determined by the father only, regardless of the mother's status. This unusual pattern was found mainly among certain rigidly patrilineal groups, especially in Africa, where it was the practice among groups such as the Migiurtini Somali, the Margi of northern Nigeria, and certain Ibo tribes. The practice, however, was not unknown in the West. It was the custom in Homeric Greece and was the norm during the seventeenth century in a few of the North American colonies, such as Maryland and Virginia, and in South Africa and the French Antilles up to the 1680s. (c) Status was determined by the principle of deterior condicio, that is, by the mother or the father, whoever had the lower status. This was the harshest inheritance rule and was the practice in China from the period of the Han dynasties up to the reforms of the thirteenth and fourteenth centuries. It found its most extreme application in Korea, where it was the dominant mode down to 1731. The rule also applied in Visigothic Spain, and in medieval and early modern Tuscany. The only known case in the New World was that of South Carolina in the early eighteenth century. (d) The fourth principle of slave inheritance, that of melior condicio, stands in direct contrast with the last mentioned, in that the child inherited the status of the free parent, whatever the gender of that parent, as long as the father acknowledged his paternity. This is the earliest known rule of slave inheritance and may have been the most widely distributed. It was the norm in the ancient Near East and, with the exception of the Tuareg, it became the practice among nearly all
Islamic societies. The rule was supported among Muslims by another Koranic prescription and practice: The injunction that a slave woman was to be freed, along with her children, as soon as she bore a son for her master. The only known cases in Europe of the melior condicio rule both emerged during the thirteenth century. In Sweden, it was codified in the laws of Ostergotland and Svealand as part of a general pattern of reforms. In Spain, religion was the decisive factor in the appearance of a modified version of the rule: baptized children of a Saracen- or Jewish-owned slave and a Christian were immediately freed. Throughout Latin America, although the legal rule was of the first type—the children of slave women were to become slaves—the widespread custom of concubinage with slave women, coupled with the tendency to recognize and manumit the offspring of such unions, meant that, in practice, a modified version of the melior condicio rule prevailed where the free person was the father, which was usually the case in mixed unions.
3. The Acquisition of Slaves
Slaves not born to, or inherited by, an owner were acquired mainly through external and internal trading systems; as part of bride and dowry payments; as money; and as gifts. There were five major slave-trading systems in world history. The Indian Ocean trade was the oldest, with records dating back to 1580 BC, and it persisted down to the twentieth century AD. Slaves from sub-Saharan Africa were transported to the Middle and Near East as well as to Southern Europe. It has been estimated that between the years 800 AD and 1800 approximately 3 million slaves were traded on this route, and over two million were traded during the nineteenth century. The Black Sea and Mediterranean slave trade supplied slaves to the ancient European empires and flourished from the seventh century BC through the end of the Middle Ages. Over a quarter of a million slaves may have been traded in this system during the first century of our era. The European slave trade prospered from the early ninth century AD to the middle of the twelfth, and was dominated by the Vikings. One of the two main trading routes ran westward across the North Sea; the other ran eastward. Celtic peoples, especially the Irish, were raided and sold in Scandinavia. Most slaves traded on the eastern routes were of Slavic ancestry. It was the Viking raiding, and wide distribution, of Slavic slaves throughout Europe that accounts for the common linguistic root of the term 'slave' in all the major European languages. The trans-Saharan trade persisted from the mid-seventh century AD down to the twentieth and involved the trading of sub-Saharan Africans throughout North Africa and parts of Europe. It has been
estimated that some 6.85 million persons were traded in this system up to the end of the nineteenth century. Although it declined substantially during the twentieth century, largely under European pressure, significant numbers of Africans are still being traded in Sudan and Mauritania. The transatlantic slave trade was the largest and certainly the most extensive of all these systems. The most recent evidence suggests that between the years 1500 and 1870, some 11 million Africans were taken to the Americas. Of the 10.5 million who were forced from Africa between 1650 and 1870, 8.9 million survived the crossing. Although all the maritime West European nations engaged in this trade, the main traders were the British, Portuguese, and French. Four regions account for 80 percent of all slaves going to the New World: the Gold Coast (Ghana), the Bights of Benin and Biafra, and West-Central Africa. Forty percent of all slaves landed in Brazil, and 47 percent in the Caribbean. Although only 7 percent of all slaves who left Africa landed in North America, by 1810 the United States nonetheless had one of the largest slave populations, owing to its unusual success in the reproduction of its slave population. For the entire three and a half centuries of the Atlantic slave trade, approximately 15 percent of those forced from Africa died on the Atlantic crossing (some 1.5 million), with losses averaging between 6,000 and 8,000 per year during the peak period between the years 1760 and 1810.
4. The Treatment of Slaves
It is difficult to generalize about the treatment of slaves, since it varied considerably not only between societies but also within them. There was no simple correlation of favorable factors. Furthermore, the same factor might operate in favor of slaves in one situation, but against them in the next. Thus, in many small, kin-based societies, slaves were relatively well treated and regarded as junior members of the master's family, but could nonetheless be sacrificed brutally on special occasions. In advanced premodern societies such as Greece and Rome, as well as in modern slave societies such as Brazil, slaves in the mines or on the latifundia suffered horribly short lives, while skilled urban slaves were often virtually free and sometimes even pampered. In the Caribbean, the provision-ground system, by which slaves supported themselves, led to high levels of malnutrition compared with the US South, where masters provided nearly all the slaves' provisions. Nonetheless, Caribbean slaves cherished the provision-ground system for the periods of self-determination and escape from the master's direct control that it offered. In general, the most important factors influencing the condition of slaves were the uses to which they were put, their mode of acquisition, their location—
whether urban or rural—absenteeism of owners, proximity to the master, and the personal characteristics of the slaves. Slaves were acquired for purely economic, prestige, political, administrative, sexual, and ritual purposes. Slaves who worked in mines or in gangs on highly organized farming systems were often far worse off than those employed in some skilled craft in urban regions, especially where the latter were allowed to hire themselves out independently. Slaves acquired for military or administrative purposes, as was often the case in the Islamic world—the Janissaries and Mameluks being the classic examples—were clearly at an advantage when compared with lowly field hands or concubines. Newly bought slaves, especially those who had grown up as free persons and were new to their masters' society, usually led more wretched lives than those born to their owners. High levels of absenteeism among owners—which was true of the owners of slave latifundia in ancient Rome as well as of the Caribbean slave societies and some parts of Latin America—often meant ill-usage by managers and overseers paid on a commission basis. Proximity to the master cut both ways with respect to the treatment of slaves. Slaves in the household were usually materially better off, and in some cases close ties developed between masters and these slaves, such as those between masters and their former nannies, or with a favored concubine. However, proximity meant more sustained and direct supervision, which might easily become brutal. Ethnic and somatic or perceived racial differences between masters and slaves operated in complex ways. Intra-ethnic slavery was uncommon in world history, although by no means nonexistent, while slavery between peoples of different perceived 'racial' groups was frequent. The common view that New World slavery was distinctive in that masters and slaves belonged to different perceived races is incorrect. Where there were somatic differences, the treatment of the slave depended on how these differences were perceived and, independently of this, on how attractive the slave was in the eyes of the master. Scandinavian women were prized in the slave harems of many Islamic sultans, but so were attractive Ethiopian and sub-Saharan women. Furthermore, in Muslim India and in eighteenth-century England and France, dark-skinned slaves were the most favored, especially as young pages. In the New World, on the other hand, mulatto and other light-skinned female slaves were often better treated than their more African-looking counterparts. Two other factors should be mentioned in considering the treatment of slaves: Laws and religion. Slightly more than half of all slave societies on which data exist had some kind of slave laws, in some cases elaborate servile codes, the oldest known being those of ancient Mesopotamia. Slightly less than half had none. Laws did make a difference: a much higher proportion of societies without any slave codes tended to treat slaves harshly. Nonetheless, the effectiveness
of laws was mediated by other factors, such as the relative size of the slave population, and by religion. The degree to which religion, especially Islam and Christianity, influenced the treatment of slaves is a controversial subject. Islam had explicit injunctions for the treatment of slaves, and these were sometimes influential, especially those relating to manumission. Although racism and a strong preference for light complexion were found throughout the Islamic lands, it is nonetheless the case that Islam as a creed has been more assimilative and universalist than any other world religion, and has rarely been implicated in egregiously racist movements similar to those that have tarnished the history of Christianity, such as apartheid, the Ku Klux Klan, Southern Christianity during the era of segregation, and the ethnic cleansing of Eastern Europe. However, Islam never developed any movement for abolition, and, in general, strongly supported the institution of slavery, especially as a means of winning converts. For most of its history, up to the end of the eighteenth century, Christianity simply took slavery for granted, had little to say about the treatment of slaves, and generally urged slaves to obey their masters. This changed radically with the rise of evangelical Christianity in the late eighteenth and early nineteenth centuries, during which it played a critical role in the abolition of both the slave trade and slavery itself in Europe and the Americas. Throughout the world, Christianity appealed strongly to slave and ex-slave populations, and in the Americas their descendants are among the most devout Christians. While Christianity may have had conservative influences on converted slaves, it is also the case that nearly all the major slave revolts from the late eighteenth century onward were influenced strongly by rebel leaders, such as Daddy Sharpe in Jamaica and Nat Turner in America, who interpreted the faith in radical terms, or by leaders of syncretic Afro-Christian religions.
5. Manumission as an Integral Element of Slavery
With a few notable exceptions, manumission, the release from slavery, was an integral and necessary element of the institution wherever it became important. The reason for this is that it solved the incentive problem implicit in slavery: The promise of redemption proved to be the most important way of motivating slaves to work assiduously on their masters' behalf. Slaves were manumitted in a wide variety of ways, the most important being: Self-purchase or purchase by others, usually free relatives; the postmortem or testamentary mode; cohabitation or sexual relations; adoption; political means or action by the state; and various ritual or
sacral means. Self-purchase, or the purchase of the slave into freedom by relatives or friends, was by far the most important means in the advanced slave economies of Greece and Rome, and in the modern capitalistic slave regimes. However, it was not the most widespread in other kinds of slave systems, and in parts of the world such as Africa it was uncommon. Postmortem manumission, by wills and other means, was common in Islamic lands and in many parts of Africa. This form of manumission was usually intimately linked to religious practices and to expectations of religious returns for what was often defined as an act of piety. As indicated earlier, in many slave societies slave concubines sometimes gained freedom from the master for themselves and their children. Manumission by the state for acts of heroism or for military action was an important, though episodic, form of manumission not only in the ancient world, but also in many New World slave societies. Thousands of slaves gained their freedom in this way, not only in Latin America, but also in North America during the American War of Independence and in the wars against Spain in the southern USA. Manumission by adoption was unusual, but in certain societies, such as ancient Rome, it constituted the most complete form of release from slavery. Slaves were sometimes manumitted for ritual or religious reasons, or on special celebratory occasions. Although thousands of slaves were manumitted at Delphi, ostensibly by being sold to Apollo, such manumissions had become merely a legal formalism by the second century BC, although the practice may have harked back to an earlier era when they were genuinely religious in character. In other societies, religious or ritual manumissions were often substitutes for earlier practices in which slaves were sacrificed. Since manumission meant the negation of slavery, for many peoples freeing slaves was symbolically identical to killing them. Such practices were common in some parts of Africa and the Pacific islands, and among some indigenous tribes of the Northwest Coast of America slaves were either killed or given away in potlatch ceremonies. There is an extension of this primitive symbolic logic in Christianity, where Christ's sacrificial death is interpreted as a substitute for the redemption of mankind from enslavement to sin and eternal death, 'redemption' (from the Latin redemptio) literally meaning 'to purchase someone out of slavery.' In all slave societies, certain categories of slave were more likely to be manumitted than others. The most important factors explaining the incidence of manumission were: Gender, the status of parents, age, skill, residence and location, the means of acquisition and, where relevant, skin color. These factors are similar to those influencing the treatment of slaves and will not be discussed further. They were also important in explaining varying rates of manumission between societies. Thus societies with relatively higher proportions of skilled slaves, a greater concentration
of slaves in urban areas, higher ratios of female to male slaves, and higher rates of concubinage between masters and slaves were more likely to have high rates of manumission than those with lower levels of these attributes. Added to this is another critical variable: The availability of slaves, either internally or externally, to replace manumitted slaves. As long as such sources existed and the replacement value of the manumitted slave was less than the price of manumission, it suited slave-owners to manumit slaves, especially when the manumitted slaves were nearing, or had already reached, the end of their useful life. However, on rare occasions the supply of slaves was cut off while demand remained high or was on the increase. This always resulted in very low rates of manumission. The most striking such case in the history of slavery was the US South, where the rise of cotton-based, capitalistic slavery in the early nineteenth century came within a few years of the termination of the Atlantic slave trade to America. Planters responded by reducing the manumission rate to near zero. A similar situation, though not as extreme, developed in the Spanish islands of the Caribbean during the nineteenth century, when the plantation system developed in the context of British enforcement of the abolition of the slave trade in the region. The result was that previously high rates of manumission plunged to very low levels. A final point concerns the status of freed persons. This varied considerably across slave societies and bore little relation to the rate of manumission. Thus manumission rates were high in the Dutch colony of Curaçao and in nineteenth-century Louisiana, but the condition of freedmen was wretched. Conversely, in the British Caribbean, where manumission rates were low, the condition of freedmen was relatively good, some groups achieving full civil liberties before the end of slavery. The main factors explaining the difference in treatment were the availability of economic opportunities for freedmen, and the extent to which the dominant planter class needed them as allies against the slaves. In the Caribbean, the small proportion of slaveholders and Europeans and the existence of a vast and rebellious slave population gave much political leverage to the small but important freed population. No such conditions existed in the United States, where the free and white population greatly outnumbered the slaves, and conditions for rebellion were severely restricted. In the ancient world, ethnic barriers meant generally low status and few opportunities for the manumitted in the Greek states, in contrast with Rome, where cultural and economic factors favored the growth and prosperity of a large freedmen class, a class that eventually came to dominate Rome demographically and culturally, with major implications for Western civilization.
See also: Caribbean: Sociocultural Aspects; Inequality; Inequality: Comparative Aspects; Property
Rights; Slavery: Comparative Aspects; Slaves/Slavery, History of; Subaltern History; West Africa: Sociocultural Aspects
Bibliography
Blackburn R 1997 The Making of New World Slavery. Verso, London
Cohen D W, Greene J P (eds.) 1972 Neither Slave Nor Free. Johns Hopkins University Press, Baltimore, MD
Davis D B 1966 The Problem of Slavery in Western Culture. Cornell University Press, Ithaca, NY
Drescher S, Engerman S L (eds.) 1998 A Historical Guide to World Slavery. Oxford University Press, New York
Engerman S 1973 Some considerations relating to property rights in man. Journal of Economic History 33: 43–65
Engerman S L (ed.) 1999 Terms of Labor: Slavery, Serfdom, and Free Labor. Stanford University Press, Stanford, CA
Engerman S, Genovese E (eds.) 1975 Race and Slavery in the Western Hemisphere. Princeton University Press, Princeton, NJ
Eltis D, Richardson D (eds.) 1997 Routes to Slavery. Frank Cass, London
Findlay R 1975 Slavery, incentives and manumission: A theoretical model. The Journal of Political Economy 83(5): 923–34
Finley M I (ed.) 1960 Slavery in Classical Antiquity: Views and Controversies. Heffer, Cambridge, UK
Fogel R W 1989 Without Consent or Contract: The Rise and Fall of American Slavery. Norton, New York
Garlan Y 1995 Les Esclaves en Grèce ancienne. Éditions La Découverte, Paris
Kirschenbaum A 1987 Sons, Slaves and Freedmen in Roman Commerce. Catholic University Press, Washington, DC
Landers J (ed.) 1996 Against the Odds. Slavery and Abolition, special issue 17(1)
League of Nations 1938 Report to the League of Nations Advisory Committee of Experts on Slavery, Vol. 6. League of Nations, Geneva, Switzerland
Lovejoy P E 2000 Transformations in Slavery: A History of Slavery in Africa. Cambridge University Press, New York
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold, trans. A Dasnois. Athlone, London
Miers S, Kopytoff I (eds.) 1977 Slavery in Africa. University of Wisconsin Press, Madison, WI
Miller J C 1993 Slavery and Slaving in World History: A Bibliography, 1900–1991. Kraus International, Millwood, NY
Patterson O 1967 The Sociology of Slavery: Jamaica, 1655–1838. MacGibbon & Kee, London
Patterson O 1982 Slavery and Social Death. Harvard University Press, Cambridge, MA
de Queiros Mattoso K M 1986 To Be a Slave in Brazil, 1550–1888, trans. A Goldhammer. Rutgers University Press, New Brunswick, NJ
Reid A (ed.) 1983 Slavery, Bondage and Dependency in Southeast Asia. University of Queensland Press, St. Lucia, Queensland
Rodriguez J P (ed.) 1997 The Historical Encyclopedia of World Slavery. ABC-CLIO, Santa Barbara, CA
Shepherd V, Beckles H (eds.) 2000 Caribbean Slavery in the Atlantic World. Ian Randle Publishers, Kingston, Jamaica
Watson J (ed.) 1980 Asian and African Systems of Slavery. Blackwell, Oxford, UK
O. Patterson
Slavery: Comparative Aspects
In the comparative study of slavery it is important to distinguish between slave-holding societies and large-scale, or what Moses Finley called 'genuine,' slave societies (Finley 1968). The former refers to any society in which slavery exists as a recognized institution, regardless of its structural significance. Genuine slave societies belong to that subset of slave-holding societies in which important groups and social processes become heavily dependent on the institution. The institution of slavery goes back to the dawn of human history. It remained important down to the late nineteenth century, and persisted as a significant mode of labor exploitation in some Islamic lands as late as the second half of the twentieth century. Remnants of it are still to be found in the twenty-first century in a few areas. Yet it was only in a minority of cases that it metastasized socially into genuine slave societies, though in far more than the five cases erroneously claimed by Keith Hopkins (1978). Thus, the institution existed throughout the ancient Mediterranean, but only in Greece and Rome (and, possibly, Carthage) did genuine slave societies emerge. It was found in every advanced precapitalist society of Asia, but only in Korea did it develop into large-scale slavery. All Islamic societies had slavery, but in only a few did there emerge structural dependence on the institution. In all, there were approximately 50 cases of large-scale slavery in the precapitalist world. With the rise of the modern world, slavery became the basis of a brutally efficient variant of capitalism. There were at least 40 such cases, counting the Caribbean slave colonies as separate units (for the most complete list, see Patterson 1982, App. C). This article examines two sets of problems. The first concerns those factors associated with the presence of institutionalized slaveholding. The critical question here is: why is the institution present in some societies yet not in other, apparently similar, ones? The second set of problems begins with the assumption that slavery exists, and attempts to account for the growth in significance of slavery. More specifically, such studies attempt to explain the origins, structure, dynamics, and consequences of genuine slave societies.
1. Comparative Approaches to Slaveholding Societies
There is now a vast and growing body of literature on slavery (Patterson 1977a, Miller 1993, Miller and
Holloran 2000). Yet, with the notable exception of certain anthropological studies (to be considered shortly), relatively few recent works are truly comparative in the sense of aiming at general conclusions about the incidence, nature, and dynamics of slavery. Even those few works that compare two slaveholding societies tend to be concerned more with highlighting, through contrast, the distinctive features of the societies under consideration. This highly particularistic trend marks a regrettable departure from earlier studies of slavery. The evolutionists of the nineteenth century were the first to offer explanations for the presence of slaveholding. Their basic proposition—that there was a close relationship between stages of socioeconomic development and the rise of slavery—has received no support from modern comparative studies (Pryor 1977). However, some of their less grandiose views have survived later scrutiny. One was their emphasis on warfare, and on the demand for labor at certain crucial points in their scales of development, as the important factors (Biot 1840, Westermarck 1908). The other was their finding that the socioeconomic role of women was critical. It was claimed, for example, that the subjection of women provided both a social and an economic model for the enslavement of men (Biot 1840, Tourmagne 1880). The most important of the early-twentieth-century theorists was H. J. Nieboer (1910), who broke with the evolutionists with his open-resource theory. His work, which is still influential, was unusual also in its reliance on statistical data. His main hypothesis was that slavery existed to a significant degree only where land or some other crucial resource existed in abundance relative to labor. In such situations, free persons cannot be induced to work for others, so they must be forced to do so. The theory has had lasting appeal, especially for economic historians, since it is both testable and consistent with marginal utility theory. It was revived by Baks (1966) and by the MIT economist Domar (1970). However, the theory has been shown to have little empirical support in modern cross-cultural data (Patterson 1977b, Pryor 1977); and Engerman (1973) has seriously questioned its theoretical consistency. In the course of criticizing Nieboer, Siegel (1945) proffered a functionalist theory which claimed that chattel slavery may be expected to occur in those societies where there is a tendency to reinforce autocratic rule by means of wealth symbols, and where the process results in a rather strongly demarcated class structure. There is little empirical support for this theory, and it verges on circularity: in many premodern hierarchical societies with wealth symbols, it was slavery that was the main cause of increased stratification. The best recent comparative work on slavery has come from historical anthropologists studying Africa. Meillassoux (1975, 1991) and his associates have
ably demonstrated from their studies of West Africa and the Sahel just how valuable a nondogmatic Marxian approach can be in understanding the dynamics of trade, ethnicity, status, and mode of production in complex lineage-based societies. The anthropologist Jack Goody (1980), drawing on Baks (1966), began by arguing that 'slavery involves external as well as internal inequality, an unequal balance of power between peoples.' From this unpromising start he enriches his analysis with both the ethnohistorical data on Africa and the cross-cultural statistical data from the Murdock Ethnographic Atlas. Central to his analysis is the role of women. The complex facts of slavery cannot be explained, he argues, 'except by seeing the role of slaves as related to sex and reproduction as well as to farm and production, and in the case of eunuchs, to power and its non-proliferation.' Valuable as they are, Goody's arguments are confined to Africa, and even with respect to this continent they are insufficient to explain why some African societies came to rely so heavily on the institution while closely related neighboring groups did not. The economist Frederic Pryor (1977) has come closest to formulating a general theory of premodern slavery using modern statistical techniques applied to cross-cultural data. He distinguishes between social and economic slavery, and argues that different factors explain the presence of one or the other. Central to his theory is the nineteenth-century idea that there is a correspondence, or 'homologism,' between male domination of women and masters' domination of slaves. He tried to demonstrate that economic slavery was most likely to occur 'in societies where women normally perform most of the work so that the slave and the wife act as substitutes for each other,' whereas social slavery was related to 'the role of the wife in a polygynous situation.' The theory is interesting and robustly argued, but it is problematic. Where women dominated the labor force there was no need for men to acquire slaves for economic purposes. On the contrary, it was precisely in such situations that slaves, when used at all, were acquired for social purposes. The presumed correspondence between wives and slaves is also questionable. There were profound differences between these two roles. Wives everywhere belonged to their communities, and were intimately kin-bound, whereas slaves everywhere were deracinated and kinless. Wives always had some rights and some power in relation to their husbands, at least in the protection of their kinsmen, and could rarely be killed with impunity, which was not true of slaves. Wives everywhere had honor, while slaves had none. Far more comparative work is needed for an understanding of the institutionalization of slavery. The main variables involved in any explanation are now well known. They are: The economic and social role of women; polygyny; internal and external warfare; the mode of subsistence (mainly whether pastoral
or agricultural); and the mode of political succession. However, the causal role and direction of each of these variables is complex, and their interaction with each other even more so. To take the role of women as an example, in some cases it is their role as producers that is important; in others, their role as reproducers. It is also difficult, using static cross-cultural data, to ascertain whether low female participation in production is the result of slavery or its cause.
2. Approaches to the Study of Genuine Slave Societies
Marxist scholars were the first to take seriously the problem of the origins, nature, and dynamics of large-scale slavery (for a review, see Patterson 1977a). Engels's (1962) periodization, according to which large-scale slavery constituted an inevitable stage in the development of all human societies, was merely one version of nineteenth-century evolutionism, discussed earlier. It dominated East European thought until the de-Stalinization movement, and is still the orthodox view in mainland China. More sophisticated, but no less problematic, have been the attempts of modern Marxists to formulate a 'slave mode of production.' The most empirically grounded of these attempts is that of Perry Anderson (1974), who argued that 'the slave mode of production was the decisive invention of the Graeco-Roman world, which provided the ultimate basis both of its accomplishments and its eclipse.' There is no longer any doubt that slavery was foundational for both Athens and Rome, and that these were the first societies in which large-scale or genuine slavery emerged. However, the concept of the 'slave mode of production' overemphasizes the materialistic aspects of slavery, making it of limited value for the comparative study of slavery. Slaves were indeed sometimes used to generate new economic systems—as in Rome and the modern plantation economies—but there are many cases in which, even when slaves were used on a large scale, there was nothing innovative or distinctive about the economic structure. The narrow materialist focus not only leads to a misunderstanding of the relationship between slavery and technology in the ancient world but, more seriously, it fails to identify major differences between the Greek and Roman cases, and it is of no value in the study of genuine slave societies in which the structurally important role of slaves was noneconomic, as was true of most of the Islamic slave systems. Several non-Marxist historical sociologists, drawing also on the experience of ancient Europe, have made important contributions to our understanding of genuine slave societies. According to Weber (1964) slavery on a large scale was possible only under three conditions: '(a) where it has been possible to maintain slaves very cheaply; (b) where there has been an
opportunity for regular recruitment through a well-supplied slave market; (c) in agricultural production on a large scale of the plantation type, or in very simple industrial processes.' However suggestive, there is little support for these generalizations from the comparative data. Medieval Korea (Salem 1978) and the US South (Fogel and Engerman 1974) disprove (a) and (b). And work on urban slavery in the modern world, as well as on the relationship between slavery and technology in the ancient world, disproves (c) (Finley 1973). Finley (1960, 1973, 1981) was the first scholar to grapple seriously with the problem of defining genuine slave societies, which he explicitly did not confine to those in which the slaves were economically important. He also correctly cautioned against too great a reliance on numbers in accounting for such societies. His emphasis on the slave as a deracinated outsider led him to the conclusion that what was most advantageous about slaves was their flexibility, and their potential as tools of change for the slaveholder class. Finley also offered valuable pointers in his analyses of the relationship between slavery and other forms of involuntary labor. And in criticizing Keith Hopkins's (1978) conquest theory of the emergence of genuine slave societies, he persuasively encouraged an emphasis on demand, as opposed to supply, factors in the rise of slave society. Romans, he argued, captured many thousands of slaves during the Italian and Punic wars because a demand for them already existed and 'not the other way around.' He postulated three conditions for the existence of this demand: private ownership of land, and some concentration of holdings; 'a sufficient development of commodity production and markets'; and 'the unavailability of an internal labor supply.'
3. A Framework for the Study of Slave Societies
It is useful to approach the comparative study of slave society with an understanding of what structural dependence means. There are three fundamental questions. First, what was the nature of the dependence on slavery; second, what was the degree of dependence; and third, what was the direction of dependence. Answers to these three questions together determine what may be called the modes of articulation of slavery. The nature of dependence on slavery may have been primarily economic or social, political or militaristic, or a combination of these. Economic dependence was frequently the case, especially in ancient Rome and in the modern capitalistic slave systems. Here the critical question is: what were the costs and benefits of imposing and maintaining an economy based on slave labor? Stanley Engerman (1973) has written authoritatively on this subject. He distinguishes between the costs of imposition, of enforcement, and of worker
productivity. Lower maintenance costs, a more constant supply of labor, greater output due to the neglect of the non-pecuniary costs of labor, higher participation rates and greater labor intensity, and economies of scale are the main factors proposed by Engerman in explaining the shift to slave labor. They apply as much to ancient Rome as they do to the modern capitalistic slave systems of America and the Caribbean. Military and bureaucratic dependence were the main noneconomic forms, and they could be as decisive for societies as was economic dependence. The rise of Islam was made possible by the reliance on slave soldiers, and the Abbasid Caliphate, along with other Islamic states, was maintained by slave and ex-slave soldiers and bureaucrats. Although there was little economic dependence on slaves in most Islamic states—with the notable exception of ninth-century Iraq—many of them qualify as genuine slave societies as a result of the politico-military dependence of these regimes on slavery, and the ways in which slaves influenced the character of their cultures, many of the literati also being slaves or descendants of slaves (see Pipes 1981, Crone 1981, Marmon 1999). The degree of dependence must be taken into account, although it is sometimes difficult to quantify. There has been a tendency to emphasize the size of the slave population over other variables, and while demography is important it can sometimes be misleading. Thus, it has been noted, as a way of playing down the importance of slavery, that no more than one in three adults in Athens were slaves at the height of its slave system in the late fifth century BCE. But as M. I. Finley liked to point out, the same was true of most parts of the American Slave South, and no one has ever questioned the fact that it was a large-scale slave system. Finally, there is the direction of dependence. Even where a society had a high functional dependence on slavery it was not necessarily the case that slavery played an active, causal role in the development of its distinctive character. In such cases the institution, though important, was not structurally transformative. In ancient Rome, the Sokoto Caliphate, and the slave societies of the Caribbean, the American South, and Brazil, slavery was actively articulated and transformative. In other cases, however, it was passively articulated, the classic instance being Korea during the Koryo and Yi periods where, although a majority of the rural population were at times slaves, there was no significant transformation of the economy, and no impact on the regime's government and culture. The same was true of several of the modern Spanish colonies of Central and South America. During the sixteenth and seventeenth centuries there was marked structural dependence on slavery in Mexico and Central America, as well as in Peru and the urban areas of Chile and Argentina, but the institution was not determinative in the course of development of these societies and, as in Korea, when slavery ended it left hardly a trace, either
cultural or social (Mellafe 1975, Palmer 1976, Klein 1986, Blackburn 1997). These three factors together determined a finite set of modes of articulation, which is a sociologically more useful construct than that of the so-called 'slave mode of production.' Space permits only the most cursory mention of some of the most important such modes of articulation. The lineage mode of articulation refers to those kin-based societies in which large-scale slavery was related critically to the rise to dominance of certain lineages in the process of class and state formation. In some cases, slavery originally served primarily economic ends; in others, mainly social and political ones; but in the majority the institution became multifunctional. This kind of genuine slave society was most commonly found in western and west-central Africa. Warfare, combined with some critical internal factor such as demographic change, accounted for the rise of slavery (see Miller 1977 for the Kongo kingdom). The ideal case of this mode of articulation was the Asante state (Wilks 1975, Klein 1981). Slaves were originally incorporated as a means of expanding the number of dependents, a tendency reinforced by the matrilineal principle of descent. The growing number of slaves enhanced the standing of the lineage heads who owned them, and facilitated the development of lineage hierarchy and state formation. Later, the role of slaves was greatly expanded to include a wide range of economic activities, including mining. The predatory circulation mode refers to those slave societies in which warfare and raiding, mainly for slaves, were the chief occupations of a highly predatory elite. The warrior class was usually assisted by a commercial class which traded heavily in slaves. There was usually a high rate of manumission of slaves, who not only contributed to the production of goods but, in their role as freedmen soldiers, helped the ruling class to produce more slaves. Thus there was a continuous circulation of persons in and out of slave status as outsiders were incorporated as loyal freedmen retainers, creating a constant need for more enslaved outsiders. Slaves and freedmen often played key roles in, and sometimes even dominated, the palatine service and elite executive jobs. The contiguous existence of pastoral and sedentary agricultural peoples with different levels of military might was the major factor in the development of this mode of articulation of slavery. The mode is strongly associated with Islam, and there are many examples of it in the Sahel and North Africa. The west–east spread of the Fulani over an area of some 3,000 miles over a period of 800 years provides one of the best cases of this mode, on which there is an abundance of historical and anthropological data (Lovejoy 2000, Pipes 1981, Meillassoux 1975, 1991). The embedded demesne mode embraces those patrimonial systems dominated by large landed estates in which slaves were incorporated on a substantial scale
to cultivate the demesne land of the lords. Serf or tenant laborers continued, in most cases, to be the primary producers of food for the society, and their rents or appropriated surpluses remained a major source of wealth for the ruling class. However, slaves were found to be a more productive form of labor on the home farms of the lords, for the cost–benefit reasons analyzed by Engerman, mentioned above. The landowners got the best of both types of labor exploitation. This was a particularly valuable arrangement where a landed aristocracy needed to change to a new crop but was not prepared to contest the technical conservatism of serfs; where there was a supply of cheap labor across the border; or where there was a high level of internal absenteeism among the landed aristocracy. This is the least understood or recognized form of advanced slave systems, perhaps because its mode of articulation was usually passive. Many of them were to be found in medieval Europe, especially in France, Spain, and parts of Scandinavia (Dockes 1982, Verlinden 1955, 1977, Bonassie 1985, Anderson 1974, Patterson 1991). The slave systems of the western 'states' of eleventh-century England, where slave populations were as high as 20 percent of the total, are likely examples. So, possibly, was Viking Iceland, which may well have had similar proportions of slaves (Williams 1937, Foote and Wilson 1970). However, the ideal case of embedded demesne slavery was to be found in Korea, especially during the Koryo and early Yi periods. Here the slave population sometimes exceeded that of other forms of bonded labor (Salem 1978, Wagner 1974, Hong 1979). The urban–industrial modes of articulation were those in which the urban elites came to rely heavily on slaves for support. Slaves played a relatively minor role in agriculture, although they may well have dominated the 'home farms' of certain segments of the ruling urban elites. Slave labor was concentrated in urban craft industries which produced goods for local consumption and export, as well as in the mining sector where it existed. Slavery emerged on a large scale in such systems as a result of a combination of factors, among which were the changing nature and frequency of warfare, conquest of foreign lands, the changing tastes of the ruling class, crises in the internal supply of labor, shifts in food staples, growing commercial links with the outside world, and demographic changes. This mode of articulation could be either passive or active. To the extent that the character of the urban civilization depended on its urban economy, and to the extent that the economy depended on slave laborers, both manual and technical, to that degree were these systems active in their articulation. The classic cases of the active mode of slave articulation were the ancient Greek slave systems, especially Athens of the fifth and fourth centuries BC (Finley 1981, Garlan 1982, De Ste. Croix 1981). Typical of the
passive mode of urban–industrial articulation were several of the Spanish slave systems of Central and South America during the sixteenth and seventeenth centuries, mentioned above. The Roman or urban–latifundic mode: ancient Roman slavery stands in a class by itself, having no real parallels in the ancient, medieval, or modern worlds. It came closest to a system of total slavery in world history. It was distinctive, first, in the sheer magnitude of its imperial power and the degree of dependence on slavery, both at the imperial center and in its major colonial sectors. Second, Rome was unique in the extent of its reliance on slavery in both its rural and urban–industrial sectors. Third, the articulation of slavery was more actively transformative than in any other system, entailing what Hopkins (1978) called an 'extrusion' of free, small farmers and their replacement by slaves organized in gangs on large latifundia. Rome was unusual too, not only for its high levels of manumission, but for the extent to which slaves and slavery came to influence all aspects of its culture. The capitalist plantation mode: contrary to the views of early economic theorists such as Adam Smith, and of Marxist scholars until fairly recently (Genovese 1965), modern plantation slavery was in no way incompatible with capitalism. Indeed, the rise of capitalism was intimately bound up with this mode of articulation of slavery, which at its height in nineteenth-century America was as profitable as the most advanced industrial factories of Europe or the northern United States (Fogel and Engerman 1974). Plantation slavery constituted one version of the worldwide systemic spread of capitalism, in which capital accumulation was advanced through the use of slaves in the peripheral colonial regions, complementing the use of so-called free labor in the metropolitan centers, and of serf and other forms of dependent labor in the semiperipheral areas of Eastern Europe and Latin America (Wallerstein 1974, Blackburn 1997). While plantation slavery bore some organizational resemblance to the ancient slave latifundia, and indeed can be traced historically to late medieval and early modern variants of the ancient model (see Solow 1991), it was distinctive in its complex, transnational system of financial support; in its production for international export; in its heavy reliance on a single crop; in its advanced organizational structure, which in the case of sugar involved innovative agri-industrial farms; in the vast distances from which slaves were imported, entailing a complex transoceanic slave-trading system; and in its reliance on slaves from one major geographic region who differed sharply from the slaveholding class in ethnosomatic terms. However, it is important to recognize that there existed concurrently with the capitalistic plantation mode several other modes of slave articulation that were either precapitalist or at best protocapitalist. As has already been noted, many of the Spanish colonial
systems in the Americas relied on forms of slavery that were nonaccumulative and distinctly premodern in their articulation, for example the urban–industrial modes of Mexico and parts of South America which were focused heavily on mining. In the Spanish Caribbean up to the third quarter of the eighteenth century a peculiar premodern form of agri-pastoral slavery prevailed which had little in common with the highly capitalistic slave plantation systems of the neighboring French and British islands, and which had to be dismantled at considerable political and socioeconomic cost in order to make way for the belated capitalistic slave systems that were imposed forcefully during the nineteenth century (Knight 1970). While slavery was largely abolished in the Americas through the course of the nineteenth century (Blackburn 1988, Davis 1975), pockets of the institution persist into the twenty-first century in northern Africa and parts of Asia (Stearman 1999). With the possible exception of Mauritania, however, in none of these societies do we find genuine slave systems. It can be claimed, with cautious optimism, that this most inhuman form of societal organization has vanished from the world.
Bibliography
Anderson P 1974 Passages from Antiquity to Feudalism. London
Ayalon D 1951 L'Esclavage du Mamelouk. Jerusalem
Baks C J et al. 1966 Slavery as a system of production in tribal societies. Bijdragen tot de Taal-, Land- en Volkenkunde 122
Biot E 1840 De l'abolition de l'esclavage ancien en Occident. Paris
Blackburn R 1988 The Overthrow of Colonial Slavery, 1776–1848. London
Blackburn R 1997 The Making of New World Slavery. London
Bonassie P 1985 Survie et extinction du régime esclavagiste dans l'Occident du haut moyen âge. Cahiers de Civilisation Médiévale 28
Crone P 1981 Slaves on Horses. New York
Davis D B 1966 The Problem of Slavery in Western Culture. Ithaca, NY
Davis D B 1975 The Problem of Slavery in the Age of Revolution, 1770–1823. Ithaca, NY
De Ste. Croix G E M 1981 The Class Struggle in the Ancient Greek World. Ithaca, NY
Dockes P 1982 Medieval Slavery and Liberation. Chicago
Domar E 1970 The causes of slavery or serfdom. Journal of Economic History 30
Engels F 1962 Anti-Dühring. Moscow
Engerman S 1973 Some considerations relating to property rights in man. Journal of Economic History 33
Finley M I (ed.) 1960 Slavery in Classical Antiquity. Cambridge
Finley M I 1968 Slavery. International Encyclopedia of the Social Sciences, Vol. 14
Finley M I 1973 The Ancient Economy. London
Finley M I 1981 Economy and Society in Ancient Greece. Shaw B D, Saller R P (eds.). New York
Fogel R W, Engerman S 1974 Time on the Cross. Boston, Vols. 1 and 2
Foote P, Wilson D 1970 The Viking Achievement. London
Garlan Y 1982 Les esclaves en Grèce ancienne. Paris
Genovese E D 1965 The Political Economy of Slavery. New York
Goody J 1980 Slavery in time and space. In: Watson J L (ed.) Asian and African Systems of Slavery. Oxford
Hong S H 1979 The legal status of the private slaves in the Koryo Dynasty (in Korean). Han'gukakpo 12
Hopkins K 1978 Conquerors and Slaves. Cambridge
Klein A N 1981 The two Asantes. In: Lovejoy P (ed.) The Ideology of Slavery in Africa. Beverly Hills, CA
Klein H S 1986 African Slavery in Latin America and the Caribbean. New York
Knight F W 1970 Slave Society in Cuba During the 19th Century. Madison, WI
Lovejoy P E 2000 Transformations in Slavery: A History of Slavery in Africa. New York
Marmon S E (ed.) 1999 Slavery in the Islamic Middle East. Princeton, NJ
Meillassoux C (ed.) 1975 L'esclavage en Afrique précoloniale. Paris
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold. (Trans. Dasnois A). London
Mellafe R 1975 Negro Slavery in Latin America. Berkeley, CA
Miller J C 1977 Imbangala lineage slavery. In: Miers S, Kopytoff I (eds.) Slavery in Africa. Madison, WI
Miller J C (ed.) 1993 Slavery and Slaving in World History: A Bibliography. Millwood, NY
Miller J C, Holloran J R (eds.) 2000 Slavery: Annual bibliographical supplement. Slavery and Abolition 21(3)
Nieboer H J 1910 Slavery as an Industrial System. Rotterdam, The Netherlands
Palmer C 1976 Slaves of the White God: Blacks in Mexico, 1570–1650. Cambridge, MA
Patterson O 1977a Slavery. Annual Review of Sociology 3: 407–49
Patterson O 1977b The structural origins of slavery: A critique of the Nieboer–Domar hypothesis. Annals of the New York Academy of Sciences 292: 12–34
Patterson O 1982 Slavery and Social Death. Cambridge, MA
Patterson O 1991 Freedom in the Making of Western Culture. New York
Pipes D 1981 Slave Soldiers and Islam. New Haven, CT
Pryor F L 1977 A comparative study of slave societies. Journal of Comparative Economics 1
Salem H 1978 Slavery in Medieval Korea. Ph.D. dissertation, Columbia University
Siegel B J 1945 Some methodological considerations for a comparative study of slavery. American Anthropologist 47
Solow B (ed.) 1991 Slavery and the Rise of the Atlantic System. Cambridge, MA
Stearman K 1999 Slavery Today. Hove
Tourmagne A 1880 Histoire de l'esclavage ancien et moderne. Paris
Verlinden C L'Esclavage dans l'Europe Médiévale. Vols. 1 and 2 (Bruges, 1955; Ghent, 1977)
Wagner E 1974 Social stratification in 17th century Korea. Occasional Papers on Korea 1
Wallerstein I 1974 The Modern World System. New York
Weber M 1964 The Theory of Social and Economic Organization. New York
Westermarck E 1908 The Origin and Development of the Moral Ideas. New York
Wilks I 1975 Asante in the Nineteenth Century. Cambridge
Williams C O 1937 Thralldom in Ancient Iceland. Chicago, IL
O. Patterson
Slaves/Slavery, History of
Slavery comes in many culturally specific forms. Common to all these different forms is the fact that slaves are denied the most basic rights. Slaves remain outside the social contract that binds the other members of a given society; they are people without honor and kin, subject to violent domination. They are objects of the law, not its subjects, considered as property, or chattel. Hence, we may describe slavery with Orlando Patterson as a form of 'social death,' and slaves as outsiders kept in a position of institutionalized marginality. In many languages, the term for foreigner also denoted slaves. Nowadays slavery is considered a most inhumane practice. The United Nations Universal Declaration of Human Rights proclaims: 'No one shall be held in slavery or servitude; slavery and the slave trade shall be prohibited in all their forms.' Signing the slavery conventions of 1926 and 1956, 122 states have committed themselves to abolish slavery and similar practices such as debt bondage, serfdom, servile marriage, and child labor. In historical times, however, slavery as an institution was accepted in many societies with a minimum of social stratification all over the world, from early Mesopotamia and Pharaonic Egypt to ancient China, Korea, India, medieval Germany, and Muscovite Russia, from the Hebrews to the Aztecs and Cherokees. Yet, according to Moses Finley there were only five full-fledged slave societies with slaves constituting the majority of people and their work determining the whole economy. All five were located in what we tend to call the West, two in antiquity: Athenian Greece and Rome between the second century BC and the fourth century AD; three in modern times: Brazil, the Caribbean, and the southern US. The Ottoman Empire was notorious for its slave soldiers, the janissaries, taken from its Slavic neighbors to the north and from Africa. The most exploitative form of chattel slavery emerged in the plantation economies of the New World under the aegis of European colonial rule, with the UK playing a dominant role. Interestingly, this happened at the time when the medieval concept of labor as a common community resource was progressively being replaced by a free labor regime in the UK. Ultimately, and for the first time in history, employer and employed in Europe were equal before the law, while slavery with its extreme status differential prevailed in the New World. The ensuing contradictions eventually led to the rise of an abolitionist movement in Europe and to the progressive banning of slavery in the nineteenth and twentieth centuries.
The last to make slavery illegal were the governments on the Arabian Peninsula. They did so in 1962. Where there are slaves, there are people arguing about slavery, some rationalizing it, others denouncing it, and yet others seeking to better demarcate the duties and rights of those involved. There is an extensive Islamic literature about slavery, paying particular attention to the question of who may, and who may not, be enslaved, to the duties of slave owners, and to the terms of manumission. Many classical and Christian authors dealt with the same problems. The abolitionist movement generated a literature of its own, stressing the horrors of slavery, as in 'Oroonoko,' the justly famous novel by Aphra Behn, and the life narrative of Olaudah Equiano, who was taken captive as a child in his native Igbo village in what is now Nigeria. A common aspect of all these contributions written by contemporaries is their normative approach, based either in religion or moral philosophy, their focus on institutional and legal matters, and not least their reliance on anecdotal evidence and personal experience. In ancient Greece and in Rome, slaves were perceived as captives of war even when bought, and they were defined as things until the Code of Justinian defined slaves as persons. Christian and Islamic thought highlighted religious difference. Slavery was primarily the fate of nonbelievers taken captive in 'just' wars. Yet, neither Christianity nor Islam prevented the enslavement of coreligionists, and if conversion obliged Islamic slave owners to manumit their slaves, it did not change the slaves' status in Christian societies. Later authors rather emphasized property relations. The same line of argument informs the 1926 League of Nations slavery convention, which defines slavery as the 'status or condition of a person over whom any or all the powers attaching to the right of ownership are exercised.' European and American scholars pioneered the historical study of slavery. Taking their cues from Greek, Roman, and medieval European experiences, they tended to interpret slavery as an intermediate stage in the general process of historical development. G. W. F. Hegel construed slavery as a sort of development aid for the benefit of Africans. Classical social sciences from Adam Smith to Max Weber identified slavery with backwardness. Marxist thinkers were even more explicit, positing primitive society, slavery, feudalism, capitalism, and communism as five consecutive stages of world history. Thus, slavery and capitalism, slavery and markets, and slavery and technological innovation were considered utterly incompatible. How then can one explain the rise of racial slavery in the plantation economies of the New World? From about 1500 to the late nineteenth century, Portuguese, British, Dutch, French, Danish, Spanish, and American vessels carried some 12 million plus Africans across the Atlantic to be used mainly as field hands for the production of sugar, coffee, tobacco, rice, indigo, and cotton. Others were put to work in the gold and silver mines of Spanish America. Together
they constituted by far the largest forced migration the world has known. Moreover, how should one interpret slavery in the South of the US, the first modern nation, if it stands in opposition to capitalist logic? Studies of US and West Indian slavery have dominated the field of slave studies from its very beginning. They set the agenda for research into other domains, such as the transatlantic slave trade and its precedents in the Mediterranean and the Black Sea during the Middle Ages and the Renaissance. In the Italian city-states slaves were common until the fourteenth century, on the Iberian Peninsula up to the sixteenth century, many of them Moslems. Christian sailors, captured on the seas, ended up in North African slavery. Later, researchers shifted their attention to plantation slavery in other areas of the New World, and finally to slavery in various non-Western societies. In the beginning, scholars studying Southern slavery followed an institutional approach, describing plantation slavery as a closed system of generalized violence and totalitarian control with dehumanizing effects similar to those of Nazi concentration camps. Slaves in these pioneering interpretations were victims, devoid of any power and will of their own. All the power was said to be vested in the slave owners who treated their slaves as mere chattel: whipping them into submission and keeping them at starvation level, even working them to death; denying them any family life and breeding them like cattle; buying and selling the slaves at their will and pleasure. Then came Eugene D. Genovese's monumental historical anthropology of plantation life in the American South. A Marxist himself, Genovese held on to the idea that the planters constituted a pre-capitalist class. Yet he showed that the slaves, although people with few if any rights, were not people without will. Quite to the contrary, they were even able to shape the structures of the system as a whole. Using cunning and other weapons of the weak in the daily transactions at the workplace, they moved their owners to accept their humanity and some minimal duties. The outcome was a paternalist system with some give and take between slave owner and slave. With this new paradigm slaves got their agency back. Slavery as such came to be seen as a negotiated relationship and as an intrinsically unstable process, the terms of which varied with time and place. This process spanned different stages from enslavement and transportation to life in slavery and possible, yet by no means inevitable, social reintegration. There were different ways to overcome the outsider status: resistance and flight, manumission, or the purchase of freedom. Prompted by the American civil rights movement, slave lives, slave culture, and slave resistance developed into privileged subjects of study. Herbert G. Gutman showed that slaves against all odds were living in families, creating enduring networks of kinship and self-help. Religion proved to be
another major sphere where slaves gained some cultural autonomy. Integrating Christian teachings and African beliefs, they developed a richly textured religious life with dance and song as integral parts of worship. As to resistance, scholars had to contend with the fact that, apart from the Haitian Revolution, slave rebels were nowhere able to end slavery. Studies of slave rebellions nonetheless brought to the fore numerous instances of organized resistance. A more salient aspect of these studies, pioneered by John Hope Franklin, was the insight into the complexities of resistance, which ranged from mocking tales and mimicking dances to the deliberate cultivation of African customs, to go-slow tactics and flight, and to the establishment of so-called Maroon societies. These societies developed into refuges for other slaves. Most of them had only a limited life span; some, however, such as the seventeenth-century kingdom of Palmares in the hinterland of Pernambuco, evolved into autonomous centers of power at the margins of the colonial system, often cultivating their own forms of slavery. Most successful were the rebels in relatively weak colonial states with open frontiers such as Brazil, Spanish America, and Surinam, where the Saramaka fought a successful war of liberation against the Dutch lasting more than 100 years. A major advance in the study of slavery was achieved when historians shifted their focus from the interpretation of anecdotal literary evidence to the analysis of statistical material and other mass data with the help of economic modeling techniques. Philip D. Curtin was the first to attempt a census of the transatlantic slave trade using customs records from the Americas. His calculations have since been complemented and refined, yet his general insights have withstood later scrutiny and still hold true at the beginning of the twenty-first century. Others tried to extend his census to the much longer established Trans-Saharan, Indian Ocean, and Red Sea trades, although there is no comparable database for these regions. Current estimates stand at 12 million plus Africans transported along these routes from the seventh to the twentieth century. Others reconstructed slave prices at the points of purchase and sale, carefully tracing changes over time, and studying all the other aspects of slave trading and the plantation business. These endeavors engendered seminal new insights. Scholars were able to prove that the length of a voyage was the single most important factor determining mortality during the voyage across the sea, the ghastly Middle Passage, with death rates ranging from 10 to 20 percent and more. They showed that pricing was competitive, that supply was responsive to changes in price, and that the transatlantic slave trade continued well into the nineteenth century. It ended when Brazil and Cuba emancipated their slaves in the 1880s, the last to do so in the Western hemisphere. Scholars also discovered that markets shifted eastwards and southwards along
the West African coast, from Senegambia and the Upper Guinea coast to the Bight of Biafra and northwestern Angola. Imports into Africa varied over time and from place to place, with cotton goods, brandy, guns, and iron tools in high demand everywhere along the coast. With the recent publication of more than 27,000 Atlantic slave-trading voyages in a single data set, even non-specialists have access to parts of the evidence used in the econometric computations. Taken together, the studies of the new economic history show that the Marxist thesis of an inherent contradiction between slavery and capitalism is untenable, as both slave trading and plantation economies were capitalist to the core, with prices set by the laws of supply and demand. In a major reinterpretation of slavery in the antebellum South, Robert William Fogel and Stanley L. Engerman also convincingly demonstrated that during the reign of 'King Cotton' the planters acted as capitalist entrepreneurs who were ready to innovate when it paid to do so and who used incentives rather than the whip to drive their slaves. They even ventured the thesis that in material terms (diet, clothing, housing, and life expectancy) slaves in the South were better off than many peasants and workers in Europe and the northern US. Moreover, the system was alive and flourishing to the end, with high levels of productivity. Parallel quantitative studies of the plantation economies in the West Indies arrived at similar conclusions: abolition came not in the wake of economic decline, but rather in spite of economic success. Seymour Drescher even coined the term 'econocide.' A major contentious issue among scholars has been the significance of slavery for the Industrial Revolution. In 1944 Eric Williams argued that the profits from the 'triangular trade' had financed the Industrial Revolution in the UK, adding that abolition was a consequence of economic decline. This double-pronged 'Williams thesis' became a central tenet of dependency theory. A close look at hundreds and thousands of slave voyages, however, showed that average profits were smaller than scholars had previously assumed and could never explain why the UK achieved the stage of self-sustained growth in the latter half of the eighteenth century. New data on capital formation in that period point in the same direction. Nevertheless, slavery and the plantation economies contributed to British economic growth and the advent of modernity. Slavery reduced the cost of sugar and tobacco for European consumers and it made great fortunes; it created markets for cheap industrial mass products, and sugar plantations with their gang-labor system may even be interpreted as 'factories in the fields' with a work discipline prefiguring that of modern industrial plants. The impact of slavery was also felt in the realm of ideas and ideology. It furthered a racially informed negative perception of all things African in the West and set the stage for racism, a most
tragic legacy of more than 400 years of slave trading that the West has not yet come to terms with. The rise of racial slavery in the Atlantic system is a reminder of the fact that economic self-interest and markets, when left to themselves, can produce the most immoral and humanly destructive institutions. The term 'racial slavery' refers to the fact that almost all slaves employed in the Americas hailed from Africa, as the Spanish had banned the enslavement of Amerindians as early as the 1540s. The impact of slavery on Africa is also much debated. Yet, once again a change of perspective has led to a shift of emphasis. An earlier generation of scholars with first-hand experience of colonial domination saw African societies mainly as victims of external forces. They singled out the transatlantic slave trade as a first step towards underdevelopment: it depopulated whole areas, or so ran the argument, and induced African rulers to engage in wars for the sake of taking prisoners for sale in exchange for guns, which then triggered new wars. This set of ideas informed dependency theory as well as world-system analysis and the early nationalist African historiography. Walter Rodney even argued that African slavery on the Guinea coast was a result of external demand. Quantitative studies shattered some of these assumptions. The gun–slave cycle could not be substantiated. In addition, historical demography, although a risky business, points to demographic stagnation from 1750 to 1850, with population losses in the areas directly affected by slaving, at a time when other continents were beginning to experience demographic growth. The causal link between the slave trade and Africa's economic backwardness, as posited by Joseph E. Inikori, is a contested issue. A better understanding of the complexities of African politics has furthermore led scholars to argue that most African wars of the period were independent of the external demand for slaves, although the slave trade generated income, strengthening some polities while weakening others. Asante, Dahomey, and the Niger Delta city-states were among the main beneficiaries, while the kingdom of Benin was one of those that abstained from participating in the slave trade. Slave sales shifted with the vagaries of African politics, not to forget the cycles of drought with the ensuing threats of starvation, which also led people into slavery. In other cases, slavery was a punishment for the crimes people had committed. To stay in business, European and American slave traders had to adapt to local circumstances. They were at the mercy of local merchants and local power holders. They hardly ever ventured beyond the ports of call and the forts they built along the West African coast. They also bought women and children, although the planters considered them second choice to men, and people from other areas than those that the planters preferred. Women constituted up to a third of
those sent to the Americas. The female ratio among slaves was higher than among early European immigrants. It ensured an enduring African physical presence. It is also further proof that African cultural and political parameters had the power to shape the Atlantic system, an argument best developed by David Eltis. Studies of African slavery have furthered the understanding of these cultural parameters. Slavery was widespread in precolonial Africa and expanded in the nineteenth century despite, or rather because of, the abolition of the transatlantic slave trade by the UK in 1807. Some scholars even argue that West African societies then developed into fully fledged slave societies with 50 percent and more of their population kept in bondage. First among these were the Sudanic societies from the Futa Jallon to Sokoto and Adamawa, where Islamic revolutions and state-building initiated more intense slave raiding than ever before. Yet, African slavery, often referred to as 'household slavery,' markedly differed from chattel slavery. As elsewhere, slaves had to do the hardest work and their owners treated them as outsiders: they were the first to be sacrificed in public rituals and could be put to death at the whim of their owner. However, it was common to find slaves in positions of, albeit borrowed, authority, as traders, officers, and court officials. Elsewhere slave and slave owner worked side by side. Alternatively, slaves lived in villages of their own. They owned property and, in some cases, even slaves of their own. More importantly, slave status was not a fixed status; rather it changed over time along a slave-to-kinship continuum, with people born into slavery exempt from further sale. Recent scholarship has stressed that women were the pre-eminent victims of slavery in Africa, as they were valued both for their labor power and for the offspring they might give birth to. Moreover, one should never forget that power and prestige in African societies depended on the number of followers someone could count on rather than on the control of land, which was not yet a scarce resource, or not everywhere. According to H. J. Nieboer, agricultural slavery developed wherever land was plentiful and the productivity of agricultural labor was low. The slave-to-kinship paradigm, as expounded most forcefully by Igor Kopytoff and Suzanne Miers, lost some of its appeal when scholars discovered that in many places a slave past carries a social stigma to this day. Yet Africa is a continent with a wide spectrum of cultures. Hence, what prevails at one place may be contradicted at another. Nonetheless, the general rule that slavery became more exploitative and more strictly closed when used for the production of export staples and in societies with strong state structures holds true for Africa as well as for the Americas. A case in point is slavery on the East African coast and the island of Zanzibar during the clove boom of the mid-nineteenth century. In West Africa too, slavery turned more adversarial when set to the production of palm oil in
the aftermath of the abolition of the transatlantic slave trade, the so-called period of legitimate trade. The less centralized power in a society was, the easier it was for slaves to overcome their outsider status. This, however, meant integration into a land-holding kin group, as access to women and land, the keys to reproduction, was generally controlled by such groups. In modern Europe, on the other hand, property rights in human labor, one's own or others', were vested in the individual. Hence freedom meant belonging in Africa and other non-Western societies, but independence in Europe. Colonialism also helped to keep the memory of slavery alive. In the early nineteenth century, the UK had done its best to end the slave trade. It even stationed a naval squadron for this purpose in West African waters and concluded anti-slave trade treaties with African rulers, who had trouble seeing the benefits of abolition. The European public later tended to take the imperialist scramble for Africa as an anti-slavery crusade, but colonial administrations, while forcefully suppressing the slave trade, were reluctant to fight slavery as a social system; rather, they closed their eyes and did their best to freeze existing social relations while claiming to bring progress to Africa. When slavery finally ended in Africa, it did so not so much as the result of conscious acts of emancipation but rather because of the emergence of a colonial labor market, with the socially defined older mechanisms for the integration of outsiders easing the transition. The men and women held in bondage eagerly took up alternatives wherever they were viable. The cultural turn in the social sciences has deeply influenced slave studies. There is no end to quantitative analyses. However, scholars have discovered that cultural parameters are important variables affecting what people do, even when they operate under the impress of a market system. Even Homo oeconomicus has to consider cultural values. Gender and slave memories have also attracted much more attention than before. The 41 volumes of interviews with former slaves conducted in the 1920s and 1930s in the US remain unrivaled, but compare also the splendid collection of peasant narratives from Niger presented by Jean-Pierre Olivier de Sardan. Archeologists have started to investigate slave material culture. Others have begun to retrace the construction of the different Creole identities and cultures in the Americas, so rich in imaginative creativity. They all encounter the same problem: as slaves by definition were people without voice, what they experienced and what they thought is to a great extent buried in the documents written by the perpetrators of slavery. To give the slaves their voice back is one of the nobler tasks of historical scholarship, but it is immensely difficult. While studies of slavery become ever more detailed and localized, historians have come to realize that the Middle Passage is more a bridge than an abyss or a one-way street, binding different cultures together, a
point first made by the anthropologist Melville J. Herskovits. Slaves came to the Americas with a history of their own; returning Africans and children of former slaves such as Olaudah Equiano, Wilmot Blyden, and James Africanus Horton were among the first to consciously define themselves as Africans. Hailing from different regions of Africa, yet sharing a common plight, they developed an awareness of a common African identity. The books they wrote helped to lay the foundations for Pan-Africanism, a strand of ideas and a political movement central to the freedom struggle of Africans in the twentieth century. David Brion Davis noted a similar correlation between slavery and freedom in Western religious, legal, and philosophical discourses. Hence the double paradox and legacy of slavery: it inflicted death and hardship on millions and millions of people of mostly African descent for the benefit of a few; and while it was instrumental for the rise of racial prejudice, it also shaped the notion of freedom and equality as we know and cherish it today. To consider these global dimensions is the challenge of any further research into the history of slavery, even when dealing with very specific local aspects of the problem. But this is easier said than done, because it requires historians to venture beyond the deeply rooted traditions of privileging the history of a particular nation-state, usually their own, into the open sea of a new comparative history. The fight against slavery is not won so long as debt bondage, servile marriage, forced labor, child labor, trafficking of women, forced prostitution, and other forms of bondage exist in many societies, including the very rich countries of the West. See also: Colonialism, Anthropology of; Colonization and Colonialism, History of; Human Rights, History of; Property, Philosophy of; Slavery as Social Institution; Slavery: Comparative Aspects; Trade and Exchange, Archaeology of
Bibliography
Bales K 1999 Disposable People: New Slavery in the Global Economy. University of California Press, Berkeley
Berlin I 1998 Many Thousands Gone: The First Two Centuries of Slavery in North America. The Belknap Press, Cambridge, MA
Blackburn R 1997 The Making of New World Slavery: From the Baroque to the Modern, 1492–1800. Verso, London
Clarence-Smith W G 1989 The Economics of the Indian Ocean Slave Trade in the Nineteenth Century. Frank Cass, London
Craton M 1978 Searching for the Invisible Man: Slaves and Plantation Life in Jamaica. Harvard University Press, Cambridge, MA
Curtin P D 1967 Africa Remembered: Narratives by West Africans from the Era of the Slave Trade. University of Wisconsin Press, Madison, WI
Curtin P D 1969 The Atlantic Slave Trade: A Census. University of Wisconsin Press, Madison, WI
Davis D B 1966 The Problem of Slavery in Western Culture. Cornell University Press, Ithaca, NY
Davis D B 1984 Slavery and Human Progress. Oxford University Press, New York
Drescher S 1977 Econocide: British Slavery in the Era of Abolition. University of Pittsburgh Press, Pittsburgh, PA
Drescher S, Engerman S L 1998 A Historical Guide to World Slavery. Oxford University Press, New York
Elkins S M 1968 Slavery: A Problem in American Institutional and Intellectual Life, 2nd edn. University of Chicago Press, Chicago
Eltis D (ed.) 1999 The Trans-Atlantic Slave Trade: A Database on CD-ROM. Cambridge University Press, Cambridge, UK
Eltis D 2000 The Rise of African Slavery in the Americas. Cambridge University Press, Cambridge, UK
Finley M I 1987 Classical Slavery. Frank Cass, London
Fogel R W 1989 Without Consent or Contract: The Rise and Fall of American Slavery, 1st edn. Norton, New York
Fogel R W, Engerman S L 1974 Time on the Cross: The Economics of American Negro Slavery. Little, Brown, Boston
Franklin J H, Moss A A 2000 From Slavery to Freedom: A History of African Americans, 8th edn. A. A. Knopf, New York
Genovese E D 1974 Roll, Jordan, Roll: The World the Slaves Made, 1st edn. Pantheon Books, New York
Gutman H G 1976 The Black Family in Slavery and Freedom, 1750–1925, 1st edn. Pantheon Books, New York
Hellie R 1982 Slavery in Russia, 1450–1725. University of Chicago Press, Chicago
Herskovits M J 1941 The Myth of the Negro Past. Harper & Brothers, New York
Inikori J E, Engerman S L (eds.) 1992 The Atlantic Slave Trade: Effects on Economies, Societies, and Peoples in Africa, the Americas, and Europe. Duke University Press, Durham, NC
James C L R 1938 The Black Jacobins: Toussaint L'Ouverture and the San Domingo Revolution. The Dial Press, New York
Klein H S 1986 African Slavery in Latin America and the Caribbean. Oxford University Press, New York
Knight F W 1997 The Slave Societies of the Caribbean. General History of the Caribbean. UNESCO Pub., London
Kopytoff I, Miers S 1977 African slavery as an institution of marginality. In: Miers S, Kopytoff I (eds.) Slavery in Africa: Historical and Anthropological Perspectives. University of Wisconsin Press, Madison, WI
Lovejoy P E 1983 Transformations in Slavery: A History of Slavery in Africa. Cambridge University Press, Cambridge, UK
Manning P 1990 Slavery and African Life: Occidental, Oriental, and African Slave Trades. Cambridge University Press, Cambridge, UK
Mattoso K M de Queirós 1986 To Be a Slave in Brazil, 1550–1888. Trans. Goldhammer A. Rutgers University Press, New Brunswick, NJ
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold. University of Chicago Press, Chicago
Miller J C 1993 Slavery and Slaving in World History: A Bibliography, 1900–1991. Kraus International Publications, Millwood, NY
Morrissey M 1989 Slave Women in the New World: Gender Stratification in the Caribbean. University Press of Kansas, Lawrence, KS
Nieboer H J 1910 Slavery as an Industrial System: Ethnological Researches, 2nd rev. edn. M. Nijhoff, The Hague, The Netherlands
Olivier de Sardan J P 1976 Quand nos ancêtres étaient captifs: Récits paysans du Niger. Paris
Patterson O 1982 Slavery and Social Death: A Comparative Study. Harvard University Press, Cambridge, MA
Phillips U B 1918 American Negro Slavery: A Survey of the Supply, Employment and Control of Negro Labor as Determined by the Plantation Regime. D. Appleton, New York
Price R 1973 Maroon Societies: Rebel Slave Communities in the Americas, 1st edn. Anchor Press, Garden City, NY
Raboteau A J 1978 Slave Religion: The 'Invisible Institution' in the Antebellum South. Oxford University Press, Oxford, UK
Rawick G P, Federal Writers' Project 1972 The American Slave: A Composite Autobiography. Contributions in Afro-American and African Studies, no. 11. Greenwood, Westport, CT
Reid A, Brewster J 1983 Slavery, Bondage, and Dependency in Southeast Asia. St Martin's Press, New York
Robertson C C, Klein M A 1983 Women and Slavery in Africa. University of Wisconsin Press, Madison, WI
Rodney W 1982 African slavery and other forms of social oppression on the Upper Guinea Coast in the context of the African slave trade. In: Inikori J (ed.) Forced Migration: The Impact of the Export Slave Trade on African Societies. Africana, New York
Schwartz S B 1985 Sugar Plantations in the Formation of Brazilian Society: Bahia, 1550–1835. Cambridge University Press, Cambridge, UK
Sheriff A 1987 Slaves, Spices, and Ivory in Zanzibar: Integration of an East African Commercial Empire into the World Economy, 1770–1873. J. Currey, London
Slavery & Abolition 1980 Cass, London, Vol. 1
Stampp K M 1956 The Peculiar Institution: Slavery in the Ante-Bellum South, 1st edn. Knopf, New York
Thornton J 1992 Africa and Africans in the Making of the Atlantic World, 1400–1680. Cambridge University Press, Cambridge, UK
Toledano E R 1998 Slavery and Abolition in the Ottoman Middle East. University of Washington Press, Seattle, WA
Verger P F 1982 Orisha: Les Dieux Yorouba en Afrique et au Nouveau Monde. Métailié, Paris
Verlinden C 1955 L'esclavage dans l'Europe médiévale. De Tempel, Bruges, Belgium
Watson J L (ed.) 1980 Asian and African Systems of Slavery. University of California Press, Berkeley, CA
Williams E E 1994 Capitalism and Slavery, 1st edn. University of North Carolina Press, Chapel Hill, NC
Willis J R 1985 Slaves and Slavery in Muslim Africa. Frank Cass, London
Wirz A 1984 Sklaverei und kapitalistisches Weltsystem. Suhrkamp, Frankfurt am Main, Germany
A. Wirz
Sleep and Health
1. Sleep Physiology
Sleep is a ubiquitous mammalian phenomenon which is associated with diminished responsiveness to external stimuli and a familiar and delightful sense of restoration under the circumstances of a normal night of sleep. The importance of sleep in daily living can
Figure 1 The distribution of sleep stages over the course of a normal night of sleep
easily be discerned when an individual has even a modest restriction of a normal night of sleep. This has been well documented to be associated with performance decrements and alterations in mood. The mysteries of sleep are many, and how it produces its restorative effects remains vague, but it is clear that it is at the core of mental and physical well-being. Biological functioning cannot be thoroughly appreciated without integrating sleep and sleep biology into our understanding of waking biology. It is obvious now to layman and professional sleep researcher alike that sleeping behavior affects waking behavior, and waking behavior affects sleeping behavior. In the mid-1950s, some of the mysteries of sleep began to unravel through a series of discoveries by Nathaniel Kleitman, William Dement, and Eugene Aserinsky at the University of Chicago (Dement 1990). These individuals produced a series of studies which delineated the brain wave patterns associated with different stages of sleep, and the unique stage of sleep which was termed rapid eye movement (REM) sleep, shown to be associated with dreaming. These stages of sleep were determined by the concomitant recording of the electroencephalogram (EEG), electro-oculogram (EOG), and electromyogram (EMG). The stages of sleep are characterized by a pattern of activity in each of these variables. Brain wave activity changes dramatically from waking to the deep stages of sleep (stages 3 and 4), with high-voltage slow waves becoming increasingly dominant. These stages are distributed throughout the night in a well-defined pattern of activity associated with REM periods approximately every 90 minutes (Fig. 1). REM periods are interspersed with stages 2, 3, and 4 such that stages 3 and 4 are identified largely in the first third of the night, with relatively little noted in the last third of the night. REM sleep is accumulated in episodes occurring every 90 minutes as noted, but as the night progresses, the REM periods become progressively longer, thus producing very long REM periods in the last third of the sleeping interval. Thus, REM sleep is identified primarily in the last third of the night, with relatively
Of interest is the fact that REM sleep is associated with some unique physiologic changes. Thermoregulatory and respiratory functioning are markedly compromised during REM sleep. For example, the vasomotor mechanisms necessary to respond appropriately to increases and decreases in ambient temperature, and thereby maintain a normal core body temperature, are suspended during REM sleep. This renders the mammalian organism poikilothermic, or cold-blooded, during REM sleep. In addition, the appropriate respiratory compensatory responses to both hypoxemia and hypercapnia are substantially blunted during REM sleep. These data combine to suggest that even under normal circumstances, REM sleep is a period of physiologic risk woven into the fabric of a normal night of sleep.

Other phenomena associated with REM sleep are of considerable interest. For example, it is known that REM sleep is associated with a skeletal muscle paralysis, substantiated by the documentation of hyperpolarization of motor neurons during REM sleep in cats. In addition, REM sleep is generally associated with thematic dreaming, and it is felt that the inhibition of skeletal muscles prevents the 'acting out' of dreams. Also, during each REM period penile tumescence occurs, resulting in a normal full erection in males (Carskadon and Dement 1994).

Non-REM (NREM) sleep, alternatively, is associated with a slowing of physiological processes, including heart rate, blood pressure, and general metabolic rate, compared to the waking state. During the first episode of slow wave sleep (stages 3 and 4) there is a dramatic increase in the secretion of growth hormone. This has been shown to be specifically linked to the first episode of slow wave sleep rather than being a circadian phenomenon. No other hormone is quite so precisely identified with secretion during a specific stage of sleep.
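The stage architecture just described lends itself to a simple numerical illustration. The short Python sketch below is purely schematic and uses hypothetical durations, not clinical scoring rules: it lays out roughly 90-minute cycles in 30-second epochs, concentrates stages 3 and 4 in the early cycles, and lengthens REM periods toward morning, reproducing the asymmetry summarized above.

```python
# Schematic hypnogram: illustrative only, not a clinical scoring algorithm.
# Assumptions (hypothetical, for illustration): five ~90-min NREM/REM cycles,
# 30-s epochs, SWS concentrated early, REM periods lengthening late.

EPOCH_MIN = 0.5          # standard 30-second scoring epoch
CYCLE_MIN = 90           # approximate NREM/REM cycle length

def schematic_hypnogram(n_cycles=5):
    """Return a list of stage labels ('1', '2', 'SWS', 'REM'), one per epoch."""
    stages = []
    for c in range(n_cycles):
        rem_min = 10 + 10 * c              # REM periods lengthen across the night
        sws_min = max(40 - 15 * c, 0)      # stages 3+4 dominate early cycles
        light_min = CYCLE_MIN - rem_min - sws_min
        for stage, minutes in (('1', 5), ('2', light_min - 5),
                               ('SWS', sws_min), ('REM', rem_min)):
            stages.extend([stage] * int(minutes / EPOCH_MIN))
    return stages

hyp = schematic_hypnogram()
# REM share of the first vs. last third of the night, mirroring Fig. 1
third = len(hyp) // 3
for name, seg in (('first', hyp[:third]), ('last', hyp[-third:])):
    print(name, 'third REM fraction:', round(seg.count('REM') / len(seg), 2))
```

Running the sketch shows the REM fraction of the last third of the night far exceeding that of the first third, as described above and in Fig. 1.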
2. Subjective Aspects of Sleep

The quality of sleep, and the physiologic parameters which produce 'good sleep' or a feeling of restoration in the morning, are not understood. Deprivation of REM sleep, as opposed to non-REM sleep, does not appear to differentially affect subsequent mood or performance. It is well established that older adults show characteristic alterations in their sleep patterns, i.e., a marked diminution in stages 3 and 4 sleep, which is quite characteristically associated with complaints of nonrestorative or poor sleep. Determining the physiologic characteristics of the subjective elements of 'good sleep' is difficult in that the subjective and physiologic aspects of sleep can be easily dissociated.
It is not unusual, for example, for individuals with significant complaints of insomnia or poor sleep to have essentially normal physiological sleep patterns. Sleep is well known to be easily disturbed by mood and psychological state. Sleep is clearly disturbed by anxiety and in patients with generalized anxiety disorder; characteristically this is associated with a prolonged sleep onset latency and multiple awakenings subsequent to sleep onset. In addition, a well-recognized prodrome to a clinically significant depressive episode is an alteration in sleep pattern which includes an early onset of REM sleep and early morning awakenings with difficulty falling back to sleep. There are also data which document the fact that REM deprivation in depressed patients can produce a significant antidepressant effect (see Depression).

Good sleep is commonly associated with good health and a sense of well-being. Measures of overall functional status have been shown to be significantly correlated with both subjective and objective measures of daytime sleepiness. Other studies have shown that sleep-disordered breathing is associated with lower general health status, with appropriate controls for body mass index, age, smoking status, and a history of cardiovascular conditions. Even very mild degrees of sleep-disordered breathing have been shown to be associated with subjective decrements in measures of health status comparable to those of individuals with chronic diseases such as diabetes, arthritis, and hypertension.

Complaints of poor sleep and/or insomnia, and of daytime sleepiness and fatigue, are common. A recent Gallup survey in the USA indicated that approximately one third of Americans have insomnia: 36 percent of American adults indicated some type of sleep problem, approximately 27 percent reported occasional insomnia, and 9 percent indicated that their sleep-related difficulty occurs on a regular, chronic basis. In a study of a large sample of Australian workers, the prevalence of a significant elevation on the Epworth Sleepiness Scale was just under 11 percent; these figures were not related significantly to age, sex, obesity, or the use of hypnotic drugs.

Perhaps the most common cause of daytime fatigue and sleepiness, aside from self-imposed sleep restriction, is obstructive sleep apnea. The sine qua non of this sleep disorder is persistent sonorous snoring. Snoring itself is the most obvious manifestation of an increase in upper airway resistance; as it progresses to more significant levels, the upper airway gradually diminishes in cross-sectional diameter and may become completely occluded. Sonorous snoring can exist as a purely social nuisance, or it can be associated with multiple episodes of partial and complete upper airway obstruction during sleep accompanied by dangerously low levels of oxygen saturation. This sleep-related breathing disorder is also associated with complaints of daytime fatigue and
sleepiness that may be only minimally obvious to some individuals, but can be quite severe and debilitating in others (Orr 1997).
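Since the Epworth Sleepiness Scale is invoked above without explanation, a brief sketch of its commonly described scoring may help. The assumptions are flagged in the comments: the eight-item, 0-3 rating structure and the conventional cutoff above 10 reflect general descriptions of the scale, not details given in this article, and the cutoff actually used in the Australian study is not specified here.

```python
# Hedged sketch of Epworth Sleepiness Scale (ESS) scoring as commonly
# described: eight dozing-likelihood items rated 0-3, summed to a 0-24
# total, with totals above ~10 conventionally read as excessive daytime
# sleepiness. The cutoff used in the Australian study cited above is not
# stated in this article, so the threshold below is an assumption.

def ess_total(item_ratings):
    assert len(item_ratings) == 8 and all(0 <= r <= 3 for r in item_ratings)
    return sum(item_ratings)

score = ess_total([2, 1, 3, 0, 2, 1, 2, 1])   # hypothetical responses
print(score, 'elevated' if score > 10 else 'within usual range')
```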
3. Sleep Disorders

A variety of documented sleep disorders are described in a diagnostic manual that can be obtained through the American Academy of Sleep Medicine. This manual gives the diagnostic criteria for a plethora of sleep disorders, ranging from disorders which manifest themselves primarily as symptoms (e.g., insomnia, narcolepsy) to physiologic disorders which can be defined only through a polygraphic sleep study, such as sleep apnea, periodic limb movements during sleep, and nocturnal gastroesophageal reflux. The prevalence of these disorders ranges from narcolepsy (5-7 cases per 10,000) to obstructive sleep apnea syndrome (OSAS; 2-4 cases per 100) to insomnia, with which as much as 25 percent of the general population report occasional problems. Space constraints do not permit a discussion of all of these disorders; we therefore touch only on the most commonly encountered ones.

Clearly, sleep disorders have well-documented consequences with regard not only to health but also to daytime functioning. Perhaps the most common of all relates to the behavioral consequences of sleep fragmentation or sleep restriction. Whether the result of anxiety or of a physiologic sleep disturbance, sleep restriction and sleep fragmentation can produce documentable declines in performance and increases in daytime sleepiness. The effects of even minimal sleep restriction are cumulative across nights, but they can be quickly reversed by a single night of normal sleep. Sleep restriction is commonly noted in shift workers; it is estimated that approximately 30 percent of the American work force works rotating shifts or a permanent night shift. Even permanent night shift workers rarely obtain 'normal' sleep in the daytime, and complaints of inadequate or nonrestorative sleep are persistent in this group. Studies have shown that accidents are 20 percent higher on the night shift, and 20 percent of night shift workers report falling asleep on the job. The demands of our increasingly complex technologic society have created a workforce that is under greater stress, obtains less sleep, and shows a clearly increasing prevalence of sleep complaints and sleep disorders.

An epidemiological study estimated that the prevalence of sleep-disordered breathing (defined as five or more obstructive events per hour) was 9 percent for women and 24 percent for men (Young et al. 1993). It was estimated that 2 percent of women and 4 percent of middle-aged men meet the minimal diagnostic criteria for sleep apnea, which include five obstructive events per hour together with the concomitant symptom of daytime sleepiness.
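These prevalence figures rest on a simple rate: obstructive events divided by hours of sleep, the apnea-hypopnea index. A minimal sketch of that arithmetic, and of the two-part minimal criterion quoted from Young et al. (1993), follows; the function names are illustrative, and the snippet is not a diagnostic tool.

```python
# Minimal sketch of the apnea-hypopnea arithmetic used in the text. The
# five-events-per-hour threshold and the requirement of concomitant daytime
# sleepiness follow the minimal criteria quoted from Young et al. (1993);
# the functions themselves are illustrative, not a diagnostic instrument.

def events_per_hour(n_obstructive_events: int, sleep_hours: float) -> float:
    """Obstructive events per hour of sleep (the apnea-hypopnea index, AHI)."""
    return n_obstructive_events / sleep_hours

def meets_minimal_criteria(ahi: float, daytime_sleepiness: bool) -> bool:
    """Five or more events per hour plus the symptom of daytime sleepiness."""
    return ahi >= 5 and daytime_sleepiness

ahi = events_per_hour(n_obstructive_events=42, sleep_hours=6.5)  # ~6.5 events/h
print(round(ahi, 1), meets_minimal_criteria(ahi, daytime_sleepiness=True))
```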
Although OSAS is felt to be a predominantly male phenomenon, its incidence rises sharply among postmenopausal women. Persistent loud snoring, and the concomitant upper airway obstruction, has been shown to carry a variety of medical and behavioral risks. Various studies have shown that snoring is a significant predictor of hypertension, angina, and cerebrovascular accidents, independent of other known risk factors. One study showed in a multiple regression analysis that snoring was the only independent risk factor differentiating stroke occurring during sleep from stroke occurring at other times of the day. Other studies have documented a higher mortality rate in patients with moderate to severe OSAS compared to those with a less severe manifestation of the disorder. Furthermore, studies have shown that in individuals with curtailed life expectancy secondary to OSAS, the most common cause of death was myocardial infarction, and other studies have confirmed that more aggressive treatment of OSAS reduces the incidence of vascular mortality (Hla et al. 1994).

The relationship between snoring, OSAS, and hypertension is somewhat controversial. The frequent association of significant OSAS with male gender and obesity makes it difficult to determine the relative contribution of each variable to hypertension. One large population study of 748 men drawn from general medical practice in England revealed a significant correlation between overnight hypoxemia and systemic blood pressure, but this could not be shown to be independent of age, obesity, and alcohol consumption (Stradling and Crosby 1990); no significant relationship was found with snoring. Alternatively, another excellent study, which utilized overnight polysomnography and 24-hour ambulatory blood pressure monitoring, did find an association between hypertension and sleep apnea that was independent of obesity, age, and sex in a nonselected, community-based adult population (Hla et al. 1994). Clinically, it is recognized that appropriate treatment of OSAS will often result in a notable reduction in blood pressure, often independent of weight loss. In a recent National Institutes of Health supported project on sleep and cardiovascular functioning, preliminary data have shown that 22 to 48 percent of hypertensive patients have significant OSAS, and 50 to 90 percent of sleep apnea patients have documented hypertension.

Perhaps the best-known 'sleeping disorder' or 'sleep sickness' is narcolepsy. Long assumed to be synonymous with uncontrollable daytime sleepiness, narcolepsy is now known to be a neurological disorder, and genetic abnormalities have recently been described in a canine model of narcolepsy. The disorder is associated with an extraordinary degree of daytime sleepiness, but that alone does not define the syndrome; many individuals with OSAS have an equivalent degree of daytime sleepiness. Narcolepsy is identified
by other ancillary symptoms. Perhaps the most common and dramatic is cataplexy, in which individuals experience a partial or complete loss of skeletal muscle control in the face of emotional stimuli such as fear, anger, or laughter. In the sleep laboratory, narcolepsy is associated with a unique sleep pattern in which an onset of REM sleep is usually noted within 10 to 15 minutes after sleep onset; this is considered a pathognomonic sign of narcolepsy. Unfortunately, there is no cure for narcolepsy at the present time, and treatment is limited to the symptomatic improvement of daytime sleepiness with stimulant medication and the control of the ancillary symptoms, primarily via tricyclic antidepressant medication.

The results of the Gallup survey indicated that 36 percent of American adults suffer from some type of sleep problem. Approximately 25 percent reported occasional insomnia, while 9 percent said that their sleep-related difficulty occurs on a regular or chronic basis. In addition, the survey documented that insomniacs were 2.5 times more likely than noninsomniacs to report vehicle accidents in which fatigue was a factor, and, compared to noninsomniacs, insomniac patients reported a significantly impaired ability to concentrate during the day. The well-documented excessive daytime sleepiness of patients with obstructive sleep apnea has been shown in several studies to result in a significant increase in traffic accidents, including at-fault accidents (Findley et al. 1989). The extreme sleepiness noted in patients with OSAS is almost certainly the result of an extreme degree of sleep fragmentation secondary to arousals from sleep caused by repeated events of obstructed breathing. Furthermore, it is well established that the sleepiness experienced by patients with OSAS is completely reversible by appropriate therapy which resolves the sleep-related breathing disorder. The most common approach to therapy currently is the application of positive nasal airway pressure via a technique referred to as continuous positive airway pressure (CPAP).

Sleep deprivation, whether the result of willful sleep restriction, poor sleep secondary to insomnia, or fragmented sleep secondary to OSAS, has substantial effects on physical and behavioral health, as noted above. Studies of chronic sleep deprivation have shown minimal physiologic effects but have documented dose-related performance decrements and decreased alertness. More recently, however, an extraordinary set of studies has demonstrated that long-term sleep deprivation does have very significant effects in rats and can ultimately result in death, most likely through a profound loss of thermal regulation (Rechtschaffen 1998). Also of interest are recent studies which suggest alterations in immune function secondary to chronic sleep deprivation in humans.

In conclusion, the intuitive notion that sleep has important consequences with regard to one's health
appears to be clearly documented. Alterations in sleep producing a fragmentation of sleep, or a restriction in the normal duration of sleep, have been shown to have significant consequences with regard to waking behavior. In addition, sleep disorders which fragment sleep, such as obstructive sleep apnea, produce effects that relate not only to waking behavior but also carry significant cardiovascular and cerebrovascular complications, as well as a significant increase in mortality. Perhaps the most important message to be gleaned from the remarkable increase in our knowledge of sleep is that sleeping behavior affects waking behavior, and waking behavior affects sleeping behavior.

See also: Stress and Health Research
Bibliography

Carskadon M, Dement W 1994 Normal human sleep: An overview. In: Kryger M, Roth T, Dement W (eds.) Principles and Practice of Sleep Medicine. Saunders, Philadelphia, PA
Dement W C 1990 A personal history of sleep disorders medicine. Journal of Clinical Neurophysiology 7: 17–47
Findley L J, Fabrizio M, Thommi G, Suratt P M 1989 Severity of sleep apnea and automobile crashes. New England Journal of Medicine 320: 868–9
Hla K M, Young T B, Bidwell T, et al. 1994 Sleep apnea and hypertension. Annals of Internal Medicine 120: 382–8
Orr W 1997 Obstructive sleep apnea: Natural history and varieties of the clinical presentation. In: Pressman M, Orr W (eds.) Understanding Sleep. American Psychological Association, Washington, DC
Rechtschaffen A 1998 Current perspectives on the function of sleep. Perspectives in Biology and Medicine 41: 359–90
Stradling J R, Crosby J H 1990 Relation between systemic hypertension and sleep hypoxaemia or snoring: Analysis in 748 men drawn from general practice. British Medical Journal 300: 75
Young T, Palta M, Dempsey J, et al. 1993 The occurrence of sleep-disordered breathing among middle-aged adults. New England Journal of Medicine 328: 1230–5
W. C. Orr
Sleep Disorders: Psychiatric Aspects

Complaints of too little sleep (insomnia), of too much sleep (hypersomnia), or of sleep that is not restorative enough are termed dyssomnias. This article gives an overview of the frequency of and risk factors for dyssomnias, of the psychiatric conditions that may be associated with them, and of the pathophysiological concepts related to sleep changes in depression. Of the more than 30 percent of the general population who complain of insomnia during the course of one year, about 17 percent report that it is 'serious.'
Insomnia is encountered more commonly in women than in men, and its prevalence increases with age. Insomnia represents one of the major features of depression and appears to be one of the prodromal symptoms of, and risk factors for, the development of major depression. Since chronic dyssomnia most often occurs as a comorbid disturbance of psychiatric and physical conditions, a thorough evaluation of the patient and his/her sleep complaints is needed to lay the foundation for accurate diagnosis and effective treatment.

The diagnosis of insomnia is based upon the subjective complaint of sleeping too little. Patients report difficulties in initiating or maintaining sleep, or nonrestorative sleep, i.e., not feeling well rested after sleep that is apparently adequate in amount, and tiredness during the day. Insomnia may occur as a primary disorder or as a secondary disorder due to other psychiatric conditions, to general medical conditions, and/or to substance misuse. Compared to secondary insomnia, the prevalence of primary insomnia is relatively small.

Hypersomnia includes complaints of excessive sleepiness characterized by prolonged sleep episodes and/or excessive sleepiness during the day. These symptoms may interfere with social, occupational, and/or other areas of functioning. Like insomnia, hypersomnia may occur as a primary disorder or as a secondary disorder due to psychiatric and/or medical conditions, and/or due to substance misuse.

Especially among patients encountered in psychiatric practices and hospitals, secondary sleep disturbances are more common than primary sleep disturbances. This is particularly important to bear in mind when evaluating a patient with sleep complaints. Dyssomnias are particularly often associated with psychiatric disorders such as depression, schizophrenia, anxiety disorders, or personality disorders, and with misuse of drugs and/or alcohol. Whenever possible, it is clinically useful to assess which is the primary and which is the secondary disturbance; this will facilitate clinical treatment and management and improve preventive measures.
1. Frequency and Risk Factors

Somnipathies are among the most frequent complaints a general practitioner must deal with. According to epidemiological studies, 19–46 percent of the population report sleep problems; of these, 13 percent suffer from moderate or severe disturbances. In terms of the narrow definition of diagnostic criteria, that is, initial insomnia, interrupted sleep, and disturbance of daily well-being, 1.3 percent of the population suffers from somnipathies.

Risk factors for somnipathies are psychological stress or psychiatric illness.
Figure 1 Sleep stages according to Rechtschaffen and Kales (1968)
A high prevalence of somnipathies was reported by volunteers suffering from stress, tension, solitude, or depression. More severe sleep problems were found to be clearly related to psychiatric illness such as depression and anxiety disorders, as well as to substance misuse. A recent World Health Organization (WHO) collaborative study in 15 different countries found that insomnia is more common in females than males and increases with age. In fact, it is thought that half of the population over 65 years of age suffers from chronic sleep disturbances. The WHO study also found that 51 percent of people with an insomnia complaint had a well-defined International Classification of Diseases-10 (ICD-10) mental disorder (mainly depression, anxiety, or alcohol problems). Although many somnipathies develop intermittently, 70–80 percent of the surveyed subjects had been suffering from sleeping problems for more than one year. Disturbances of concentration and memory, difficulties in getting on with daily activities, and depressive mood changes were among the problems most often related to insomnia. The pressure imposed by their ailment leads many patients to seek relief in alcohol and drugs.

Insomnia also has costly financial consequences. The economic impact of insomnia can be divided
into direct and indirect costs. Direct costs of insomnia include outpatient visits, sleep recordings, and medications directly devoted to insomnia. Exact figures are scarce; however, the direct costs of insomnia in the US were estimated in 1990 at $10.9 billion (with $1.1 billion devoted to substances used to promote sleep and $9.8 billion associated with nursing home care for elderly subjects with sleep problems). The direct costs related to the evaluation of sleep disorders by practitioners seem to be a small part of the total costs of insomnia. The indirect costs of insomnia include the presumed secondary consequences of insomnia, such as health and professional problems and accidents. The exact quantification of these costs is, however, controversial. It is often not known whether sleep disorders are the cause or the consequence of various medical or psychiatric diseases. For instance, it has been observed that insomniacs report more medical problems than good sleepers do and have twice as many doctor visits per year as good sleepers do. Furthermore, subjects with severe insomnia appear to be hospitalized about twice as often as good sleepers, and insomniacs consume more medication for various problems than good sleepers do.
These results confirm previous observations showing that insomnia is statistically associated with worse health status. Again, it cannot be established whether insomnia is the cause or the result of this worse status. For instance, one could reasonably hypothesize that insomnia promotes fatigue, which could increase the risk of some diseases or simply lower the threshold for others to develop.
2. Regulation of Normal Sleep

The quantity and quality of sleep are measured with polysomnographic recordings (PSG) that include the electroencephalogram (EEG), electromyogram (EMG), and electrooculogram (EOG). Human sleep consists of two major states, rapid eye movement (REM) sleep and non-REM (NREM) sleep (Fig. 1). NREM sleep is characterized by increasing EEG synchronization and is divided into stages 1–4; stages 3 and 4 are also termed slow wave sleep (SWS) or delta sleep ('delta waves'). REM sleep has also been termed paradoxical sleep because of its wake-like, desynchronized EEG pattern combined with an increased arousal threshold. Typical nocturnal sleep is characterized by 3–5 cycles of NREM and REM sleep of about 60–100 minutes' duration. At the beginning of the night, the cycles contain more NREM sleep, particularly more SWS; towards the end of the night, the amount of REM sleep increases and the amount of SWS decreases.

The following standard terms are used to describe the quality and quantity of sleep: sleep continuity refers to the balance between sleep and wakefulness (i.e., initiating and maintaining sleep), and sleep architecture refers to the amount, distribution, and sequencing of specific sleep stages. Measures of sleep continuity include sleep latency (usually defined as the duration between lights out and the first occurrence of stage 2 sleep), intermittent wake after sleep onset, and sleep efficiency (the ratio of the time spent asleep to the total time spent in bed). REM latency refers to the elapsed time between sleep onset and the first occurrence of REM sleep. The amount of each sleep stage is quantified by its percentage of the total sleep time. REM density is a measure of eye movements during REM sleep; typically it is low in early REM sleep periods and increases in intensity with successive REM sleep periods. In normal sleep, the cycle of NREM and REM sleep lasts approximately 60–100 minutes. Sleep usually begins with NREM stage 1 and progresses to stage 4 before the appearance of the first REM period. The duration of the REM sleep episodes and the REM density usually increase throughout subsequent sleep cycles.
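These standard measures are straightforward to compute once a night has been scored into epochs. The following Python sketch applies the definitions just given to an epoch-by-epoch hypnogram; the epoch length, stage labels, and the illustrative night are assumptions made for the example, not part of any scoring standard quoted here.

```python
# Minimal sketch: the continuity and architecture measures defined above,
# computed from an epoch-by-epoch hypnogram. Assumes 30-s epochs labeled
# 'W' (wake), '1'-'4' (NREM stages), or 'R' (REM); illustrative only.

EPOCH_MIN = 0.5

def sleep_metrics(hypnogram):
    asleep = [s != 'W' for s in hypnogram]
    onset = hypnogram.index('2')                     # latency to first stage 2
    sleep_latency = onset * EPOCH_MIN
    waso = asleep[onset:].count(False) * EPOCH_MIN   # wake after sleep onset
    tst = sum(asleep) * EPOCH_MIN                    # total sleep time
    efficiency = tst / (len(hypnogram) * EPOCH_MIN)  # sleep / time in bed
    rem_latency = (hypnogram.index('R') - onset) * EPOCH_MIN
    pct = {s: 100 * hypnogram.count(s) * EPOCH_MIN / tst for s in '1234R'}
    return dict(sleep_latency=sleep_latency, waso=waso, tst=tst,
                efficiency=round(efficiency, 2), rem_latency=rem_latency,
                stage_percent=pct)

night = ['W'] * 20 + ['1'] * 10 + ['2'] * 90 + ['3'] * 40 + ['4'] * 40 \
        + ['2'] * 20 + ['R'] * 30 + ['W'] * 4 + ['2'] * 100 + ['R'] * 60
print(sleep_metrics(night))
```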
Figure 2 Bidirectional interaction between sleep EEG and nocturnal hormone secretion
As shown in Fig. 2, the pattern of nocturnal growth hormone secretion is associated with the development of SWS during the first NREM sleep period. Cortisol secretion, on the other hand, appears to be associated with an increased amount of REM sleep. Several NREM and REM sleep measures, including the amount of SWS and REM latency and density, may be particularly altered in affective and schizophrenic disorders, in aging, and with the administration of certain drugs.

At least three major processes are involved in the regulation of normal sleep. According to the Two-Process Model of sleep regulation, sleep and wakefulness are influenced by both a homeostatic and a circadian process that interact in a complex way. The homeostatic Process S (related to SWS) increases with the duration of wakefulness prior to sleep onset and augments sleep propensity. The circadian Process C, driven by the internal clock located in the suprachiasmatic nuclei (SCN), describes the daily cycle of sleepiness and wakefulness and is also related to REM sleep. The third process is the ultradian rhythm of NREM and REM sleep. These electrophysiological measures are associated with endocrinological and other physiological events.

Wakefulness, NREM sleep, and REM sleep appear to be controlled by interacting neuronal networks rather than by unique necessary and sufficient centers.
Figure 3 Neuroanatomy of sleep–wake regulation
A simplified neuroanatomy of sleep–wakefulness is shown in Fig. 3. The NREM–REM sleep cycle is regulated within the brainstem. According to current concepts, REM sleep is initiated and maintained by cholinergic neurons originating within the laterodorsal tegmental and pedunculopontine nuclei in the dorsal tegmentum, and is inhibited by noradrenergic and serotonergic neurons originating in the locus coeruleus and dorsal raphe nuclei. Human pharmacological data are consistent with these neurophysiological concepts of the control of REM and NREM sleep. In contrast to the brain-activated state of REM sleep, NREM sleep is characterized by synchronized, rhythmic inhibitory–excitatory potentials in large numbers of neurons in cortical and thalamic regions. Other areas implicated in the control of NREM sleep include cholinergic and noncholinergic neurons in the basal forebrain, the hypothalamus, and the area of the solitary tract.

In addition to neuronal mechanisms in the control of sleep, more than 30 different endogenous substances have been reported to be somnogenic. These include delta sleep-inducing peptide, prostaglandin D2, vasoactive intestinal peptide, adenosine, growth hormone-releasing hormone, and several cytokines including interleukins, tumor necrosis factor-α, and interferons. The significance of endogenous sleep factors in normal
sleep physiology remains to be proven, but the possibility has interesting implications. As the Two-Process Model of sleep regulation suggests, increased sleep propensity with progressive duration of wakefulness might be associated with the accumulation of a sleep factor, and the homeostatic function of sleep, or at least of delta sleep, might reflect its 'catabolism.'
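The Two-Process Model lends itself to a compact numerical sketch. In the Python fragment below, the time constants, bounds, and the sinusoidal form of Process C are illustrative placeholders rather than the fitted parameters of Borbély and Achermann (1999); the point is only the qualitative behavior described above, i.e., Process S rising during wakefulness and decaying during sleep against a 24-hour circadian oscillation.

```python
import math

# Qualitative sketch of the Two-Process Model: homeostatic Process S rises
# exponentially toward an upper bound during wakefulness and decays during
# sleep, while circadian Process C oscillates with a 24-h period. The time
# constants and bounds are hypothetical placeholders, not fitted parameters.

TAU_RISE, TAU_FALL = 18.0, 4.0   # hours (assumed for illustration)
UPPER, LOWER = 1.0, 0.0

def step_S(S, awake, dt=0.25):
    if awake:   # saturating rise toward UPPER while awake
        return UPPER + (S - UPPER) * math.exp(-dt / TAU_RISE)
    return LOWER + (S - LOWER) * math.exp(-dt / TAU_FALL)  # decay asleep

def process_C(t_hours):
    return 0.5 + 0.5 * math.cos(2 * math.pi * (t_hours - 16) / 24)

S, t = 0.3, 0.0
for _ in range(96):                 # one simulated day in 15-min steps
    awake = (t % 24) < 16           # assumed 16 h wake / 8 h sleep schedule
    S = step_S(S, awake)
    t += 0.25
    if t % 6 == 0:
        print(f"t={t:4.1f} h  S={S:.2f}  C={process_C(t):.2f}")
```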
3. Sleep and Psychiatric Disorders

Polygraphic sleep research helped pave the way to the modern age of scientific, biological psychiatry. Much of the early sleep research, beginning in the mid-1960s, consisted of descriptive mapping of events taking place during sleep, and it laid out the currently rich picture of polysomnographic features in the different psychiatric disorders. Although many of the objective sleep abnormalities associated with psychiatric disorders appear not to be diagnostically specific, these studies have established sleep measures as neurobiological windows into the underlying pathophysiology associated with psychiatric illnesses.

Psychiatric disorders are among the most common primary causes of secondary sleep complaints, particularly of insomnia. Sleep abnormalities may be caused by central nervous system abnormalities associated with psychiatric illnesses as well as by accompanying behavioral disturbances.
Table 1 Typical EEG sleep findings in psychiatric disorders (meta-analysis by Benca et al. 1992)
Patients with depression, anxiety disorders, misuse of alcohol or drugs, schizophrenia, or personality disorders may complain of difficulty falling asleep, difficulty staying asleep, or inadequate sleep. Although specific sleep patterns are not necessarily diagnostic of particular psychiatric disorders, there are relationships between certain sleep abnormalities and categories of psychiatric disorders (Table 1). A review of the literature on sleep in psychiatric disorders showed that the EEG sleep of patients with affective disorders differed most frequently from that of normal control subjects.
4. Sleep Abnormalities in Depression

Sleep abnormalities in patients with major depressive disorders, as assessed by laboratory studies, can be classified as difficulties of sleep continuity, abnormal sleep architecture, and disruptions in the timing of REM sleep. Sleep initiation and maintenance difficulties include prolonged sleep latency (sleep onset insomnia), intermittent wakefulness and sleep fragmentation during the night, early morning awakenings with an inability to return to sleep, reduced sleep efficiency, and decreased total sleep time. With regard to sleep architecture, abnormalities have been reported in the amounts and distribution of NREM sleep stages across the night, and include increased shallow stage 1 sleep and reductions in the amount of deep, slow-wave (stages 3 and 4) sleep. REM sleep disturbances in depression include a short REM latency
(less than 65 minutes), a prolonged first REM sleep period, and an increased total REM sleep time, particularly in the first half of the night.

Sleep disturbances are generally more prevalent in depressed inpatients, whereas only 40–60 percent of outpatients show sleep abnormalities. Moreover, a recent meta-analysis indicated that no single polysomnographic variable can reliably distinguish depressed patients from healthy control subjects or from patients with other psychiatric disorders. Table 1 gives an overview of the typical EEG sleep changes in major psychiatric conditions. This has prompted some researchers to conclude that clusters or combinations of sleep variables better describe the nature of sleep disturbances in depression.

Although there is some disagreement as to which specific sleep EEG variables best characterize depressed patients, the importance of sleep to depression is clear. Persistent sleep disturbance is associated with significant risk of both relapse and recurrence, and with increased risk of suicide. Sleep variables such as REM latency have also been shown to predict treatment response and the clinical course of illness in at least some studies. It has also recently been suggested that the nature of the sleep disturbance at initial clinical presentation may be relevant to the choice of antidepressant medication and the likelihood of experiencing treatment-emergent side effects.

One of the most sensitive parameters for discriminating patients with major depression from patients with other psychiatric disorders and from healthy subjects is REM density, which is substantially elevated only in depressed patients. The persistence of a depression-like sleep pattern in fully remitted depressed patients suggests that the pattern is a trait characteristic. However, in the past, subjects have undergone investigation only after the onset of the disorder, and therefore the altered sleep pattern may merely represent a biological scar. The answer to the question 'trait or scar' lies in the investigation of potential patients before the onset of the disorder. The EEG sleep patterns of subjects without a personal history but with a strong family history of an affective disorder differed from those of controls without any personal or family history of psychiatric disorders, showing a depression-like sleep pattern with diminished SWS and increased REM density. Follow-up studies will determine whether this sleep pattern indeed represents a trait marker indicating vulnerability.

The importance of sleep in depression is also shown in other ways. Many well-documented studies show that total and partial sleep deprivation, or selective REM sleep deprivation, has antidepressant effects. Additionally, following total or partial sleep deprivation, patients with depression appear to be uniquely susceptible to clinical mood changes when they return to sleep. Patients who have shown a clinically significant antidepressant response to sleep deprivation are
at risk of awakening depressed again, even after very short naps (see Depression; Antidepressant Drugs).
5. Neurobiology of Sleep: Relevance to Psychiatric Disorders

Inspired by the growing knowledge of the underlying neurobiology of sleep, investigators have proposed theories of the pathophysiology of psychiatric disorders. Because depression has been studied more than any other psychiatric syndrome in recent decades, the models have attempted to explain features of sleep in depression, such as short REM latency and decreased delta sleep, as well as the antidepressant effects of sleep deprivation in depressed patients. Among the most prominent models of sleep changes in depression are the cholinergic–aminergic imbalance hypothesis, the Two-Process Model of sleep regulation, the Phase Advance Hypothesis, the Overarousal Hypothesis, and the REM sleep hypothesis.

The 'cholinergic–aminergic imbalance hypothesis' for depression postulates that depression arises from an increased ratio of cholinergic to aminergic neurotransmission in critical central synapses. Because various features of the sleep of depressed patients (decreased REM latency and delta sleep) have been simulated in normal volunteers by pharmacological probes, the reciprocal interaction hypothesis from basic sleep research and the cholinergic–aminergic imbalance model from clinical psychiatry have been correlated. The reciprocal interaction model assumes that the cyclic alternation of NREM and REM sleep is under the control of noradrenergic/serotonergic and cholinergic neuronal networks. Linking these concepts, it is suggested that if depression results from diminished noradrenergic and serotonergic neurotransmission, cholinergic activity would be expected to increase, leading to the sleep disturbances of depression.

In recent years, the role of serotonin in the regulation of sleep has attracted increased attention. Among other neurotransmitters, serotonin plays a role in the pathophysiology of depression and its treatment; nearly all antidepressants ultimately lead to an enhancement of serotonergic neurotransmission, which is believed to be associated with clinical improvement. Depression is associated with a disinhibition of REM sleep (shortened REM latency, increased REM density), and serotonin leads to a suppression of REM sleep. Furthermore, there is evidence that the antidepressant effect of sleep deprivation is related to a modification of serotonergic neurotransmission; thus sleep regulation and depression share common pathophysiological mechanisms at the serotonergic level.

According to the 'Two-Process Model of sleep regulation,' depression is thought to result from, or to be associated with, a deficiency of Process S. This model
was supported by evidence that EEG power density in the delta frequencies is decreased during sleep in depressed patients compared with controls, and by the fact that sleep deprivation increases both Process S and mood. The 'Extended Two-Process Model of sleep regulation' integrates the interaction of hormones with EEG features. Preclinical investigations and studies in young and elderly normal controls and in patients with depression demonstrate that neuropeptides play a key role in sleep regulation. As an example, growth hormone-releasing hormone (GHRH) is a common stimulus of SWS and growth hormone release, whereas corticotropin-releasing hormone (CRH) exerts opposite effects. It is suggested that an imbalance of these peptides in favor of CRH contributes to the changes in sleep EEG and endocrine activity during depression. Based on the findings that the hypothalamic–pituitary–adrenocortical (HPA) axis is dysregulated in depression, that CRH produces EEG sleep changes reminiscent of depression (SWS reduction, REM sleep disinhibition), and that the somatotrophic system reciprocally interacts with the HPA system (decreased GHRH and SWS), it has been postulated that a deficient Process S is associated with deficient GHRH and an overdrive of CRH. Although abnormally high values of cortisol secretory activity normalize after recovery from depression, growth hormone release and several characteristic disturbances of the sleep EEG may remain unchanged. These putative trait-dependent alterations suggest that strategies aiming at the restoration of these sleep changes are worthy of exploration for their potential antidepressant effect.

The depressed state appears to be associated with a disturbance not only of sleep–wake homeostasis but also of circadian function, or of their interaction. The 'Phase Advance Hypothesis' suggests that the phase position of the underlying circadian oscillator is 'phase advanced' relative to external clock time. This is supported by studies showing that short REM latency can be simulated in normal controls by appropriate phase shifts of the hours in bed.

As mentioned above, sleep deprivation has potent antidepressant effects in more than half of all depressed patients. This observation prompted the hypothesis that depressed patients are 'overaroused.' The 'Overarousal Hypothesis' was later put forward and supported by psychological self-rating studies suggesting that clinical improvement after sleep deprivation in depression may be associated simultaneously with subjective feelings of more energy (arousal) and of less tension and more calmness (de-arousal). Other data consistent with this hypothesis include the short, shallow, and fragmented sleep patterns, lowered arousal thresholds, and elevated nocturnal core body temperature often seen in depressed patients. This hypothesis has been tested by means of studies of localized cerebral glucose metabolism with 18F-deoxyglucose positron emission tomography
in separate studies of depression, of sleep deprivation, and of the first non-REM period of the night. In these studies it was found that elevated brain metabolism prior to sleep deprivation predicted clinical benefit in depressed patients and that normalization of these measures was associated with clinical improvement. It was also found that local cerebral glucose metabolism in the cingulate and amygdala at baseline was significantly higher in clinical responders than in nonresponders or normal controls. Furthermore, it was shown that the glucose metabolic rate was increased during the first non-REM period in depressed patients compared with normal controls. Moreover, these studies demonstrated significant 'hypofrontality,' that is, a reduced ratio of frontal to occipital activity compared with normal controls.

The 'REM sleep hypothesis' of depression is based on the findings that REM sleep is enhanced in depression and that virtually all antidepressant drugs suppress REM sleep. Early studies, which have not been replicated, showed that a sustained remission from depression may be achieved by selective REM sleep deprivation carried out repeatedly, every night, for about two weeks. This treatment modality was not pursued because long-term REM sleep deprivation was too exhausting for the patients. Recently, researchers have proposed a treatment modality that combines some of the above hypotheses: the so-called 'Sleep-Phase-Advance' protocol, that is, scheduling sleeping times in a way that minimizes the occurrence of REM sleep, has been shown to produce a sustained antidepressant effect. However, this treatment modality is also demanding for both the patients and the institutional staff.
6. Future Perspectives

We have outlined just some aspects of sleep research that are relevant to depression. Linking basic and clinical approaches has been one of the 'royal roads' to the neurobiological underpinnings of psychiatric diseases and their treatment, and it continues to narrow the gap between bench and bedside. One of the most impressive links between sleep and depression is the fact that sleep deprivation alleviates depressive symptoms within hours, and that sleep may restore the initial symptoms within hours or even minutes. Given that the core symptoms of only a few psychiatric and medical conditions can be 'switched' on and off in this way, one promising lead for future research in depression is the quest to understand the mechanisms underlying the neurobiological processes associated with sleep and wakefulness. It is hoped that the rapidly evolving progress in basic neuroscience, including recent molecular biology with systematic screening of gene expression and functional brain imaging techniques, will help us progressively uncover the mystery of sleep and ultimately improve treatment strategies for depression.
Bibliography

Benca R M, Obermayer W H, Thisted R A, Gillin J C 1992 Sleep and psychiatric disorders: A meta-analysis. Archives of General Psychiatry 49: 651–68
Borbély A A, Achermann P 1999 Sleep homeostasis and models of sleep regulation. Journal of Biological Rhythms 14(6): 557–68
Gillin J C, Seifritz E, Zoltoski R, Salin-Pascual R J 2000 Basic science of sleep. In: Sadock B J, Sadock V A (eds.) Kaplan & Sadock's Comprehensive Textbook of Psychiatry, 7th edn. Lippincott Williams and Wilkins, Philadelphia, pp. 199–208
Holsboer-Trachsler E, Kocher R 1996 Somnipathies: new recommendations for their diagnosis and treatment. Drugs of Today 32(6): 477–82
Kupfer D J 1999 Pathophysiology and management of insomnia during depression. Annals of Clinical Psychiatry 11(4): 267–76
Rechtschaffen A, Kales A 1968 A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Department of Health, Education and Welfare, Neurological Information Network, Bethesda, MD
Silva J, Chase M, Sartorius N, Roth T 1996 Special report from a symposium held by the World Health Organization and the World Federation of Sleep Research Societies: an overview of insomnias and related disorders: recognition, epidemiology, and rational management. Sleep 19(5): 412–6
Steiger A, Holsboer F 1997 Neuropeptides and human sleep. Sleep 20: 1038–52
Tononi G, Cirelli C 1999 The frontiers of sleep. Trends in Neurosciences 22(10): 417–8
E. Holsboer-Trachsler and E. Seifritz
Sleep Disorders: Psychological Aspects

1. Definition

Good sleep is usually defined by its consequences: it is the amount and quality of sleep that results in the ability to feel and function well the next day. In contrast, a sleep disorder is a disturbance in the quantity or quality of sleep that interferes with waking performance and/or feelings of well-being. Psychological factors play a role in both the etiology and the maintenance of many of the 84 sleep disorders recognized in The International Classification of Sleep Disorders Diagnostic and Coding Manual (1997). These in turn have both short-term and long-term effects on the psychological functioning of patients.
2. Background

Historically, sleep has not been an area of concern in psychology. Aside from psychoanalytic theory, which placed a good deal of emphasis on the role of unconscious motivation in explaining behavior, most psychological theory was based on behavior that was objectively observable. With the discovery in the early 1950s of rapid eye movement (REM) sleep, and its
close association with the distinctive mental activity of dreaming, a good deal of work was undertaken to explore whether and how this interacted with waking behavior. The question of whether dreaming has some unique psychological function was driven by the observation that this phenomenon was universal, regularly recurring, and persistent when experimentally suppressed. Initially it was speculated that the study of dreaming might lead to a better understanding of the psychoses. Was the hallucination of mental illness a misplaced dream? Was the high degree of brain activation in REM sleep conducive to memory consolidation of new learning? Did dreaming play a role in emotional adaptation?

When, after some 25 years of experimentation, no clear consequences for waking behavior could be attributed to the suppression of REM sleep, interest in this area faded. Loss of REM sleep did not appear to interfere with memory consolidation, nor did it promote waking hallucinations; its role in emotional adaptation remained speculative. Foulkes's (1982) study of children's dream development concluded that dream construction was not something special. It could be explained by the periodic internal activation of the brain stimulating bits of sensory memory material, represented at the same level of cognitive sophistication as the child was capable of achieving in waking thought. Adult dreams were further diminished in importance by the Activation-Synthesis hypothesis of Hobson and McCarley (1977), whose work placed the area of the brain where REM sleep turns on not where the higher mental processes take place but in the lower brain stem. In this theory, dreams acquire meaning only after the fact, by association to what are initially unplanned, random sensory stimuli.
3. Current Research

Interest in the 24-hour mind, the interaction of waking–sleeping–waking behavior, came back into focus with the development of the field of sleep disorders medicine. The impact of disordered sleep on psychological functioning is documented most convincingly through large epidemiological studies of insomnia and hypersomnia that involved extensive follow-up. The findings of Ford and Kamerow (1989) that both insomnia and hypersomnia are significant risk factors for new psychiatric disorders have now been replicated in several large studies. Further work has established that the onset of a new episode of waking major depression can be predicted from the presence of two weeks of persistent insomnia (Perlis et al. 1997). These findings have sparked the development of psychological interventions for the control of insomnia in an effort to prevent the development of these psychiatric disorders. Some of these are behavioral programs that work directly to manipulate sleep
to improve its continuity; others are psychotherapeutic in nature, such as interpersonal therapy and cognitive behavioral therapy, which address the interaction patterns and the emotional and cognitive styles that are dysfunctional in depressed persons.

Again, it was the observation that morning mood on awakening in major depression was frequently low that suggested an investigation of the REM sleep and dreaming of depressed persons, to test whether defects in this system prevented overnight mood regulation or emotional adaptation. The work of Kupfer and his colleagues (Reynolds and Kupfer 1987) established that there are several REM sleep deviations associated with major depression that are not seen in normal individuals. This led to a variety of manipulations in an attempt to correct them. Vogel's studies (Vogel et al. 1975) of extensive REM deprivation interspersed with recovery nights were most successful in improving waking mood and in bringing about remission without further treatment. Cartwright's work (Cartwright et al. 1998) on the effect on morning mood of various within-night dream affect patterns showed that a 'working through' pattern, from dreams expressing negative mood to those with predominantly positive affect, predicted later remission from depression.

The recurring nightmares characteristic of posttraumatic stress disorder (PTSD) have stimulated renewed interest in addressing dreams directly in treatment programs, especially for those with long-lasting symptoms of this disorder. These methods typically involve active rehearsal of mastery strategies for the control of the disturbing dream scenario. A good deal of effort is now being devoted to discovering those attitudes, beliefs, and habitual behaviors that are implicated in maintaining poor sleep patterns, since the correction of these may help to restore normal sleep and abort the development of a major psychiatric disorder.

Studies of the psychological profiles of those exhibiting various sleep disorders have most often employed the Minnesota Multiphasic Personality Inventory (MMPI). These show that insomnia patients have more scale elevations above the norms than do matched controls, especially on scales indicating neurotic personality characteristics such as phobias and obsessive-compulsive disorders. Generally, insomnia patients appear to internalize tensions rather than express them, and may often somaticize these tensions and express them as pain syndromes. In terms of demographic variables, most studies report that more women than men complain of insomnia. Rates are higher in those who are separated or widowed rather than single or married, and among the unemployed. In addition, insomnia is more common in those of middle to lower socioeconomic status than in the highest class. This picture suggests that a loss of the time structure of work and of a love relationship may be precipitating psychological factors in disrupting sleep/wake rhythms.
Other sleep disorders, such as periodic limb movements of sleep and sleep-related breathing disorders, are not related to personality, but they do have an impact on relationships, owing to an inability to share the bed at night, and on waking performance. Severe levels of any sleep disorder, whether insomnia or hypersomnia, limit work efficiency, cognitive clarity, and emotional stability.
4. Methodological Issues

Technological problems still hamper progress in this field. Sleep laboratory studies are expensive and in many ways unnatural. Home studies, while possible, do not allow repairs or adjustments to be made as needed. Subjects must be awakened to obtain reports of their ongoing mental activity if dream content is to be investigated; this aborts the end of each REM period, thus truncating the dream story. Dream reports are also limited by the subject's ability to translate his/her sensory experience during sleep into waking verbal terms.

Recent studies using positron emission tomography (PET) to image brain activity during REM sleep have established that dreaming sleep differs from waking and from non-REM sleep in the activation of the limbic and paralimbic systems in the presence of lower activity in the dorsolateral frontal areas (Maquet et al. 1996, Nofzinger et al. 1997). This is interpreted as confirming that dreaming engages the emotional and drive-related memory systems in the absence of the higher systems of planning and executive control. This gives new impetus to the study of dreams as a unique mental activity, which may now move ahead on firmer ground within a 24-hour psychology.
5. Probable Future Directions

Work initiated by Solms (1997) on the effect of localized brain lesions on dream characteristics suggests a new line of investigation in mapping how the brain constructs dreams. This strategy may give more power to understanding some of the dream disorders that are currently less well understood, perhaps even the hallucinations of psychoses.

See also: Sleep: Neural Systems; Sleep States and Somatomotor Activity
Bibliography

American Sleep Disorders Association 1997 The International Classification of Sleep Disorders Diagnostic and Coding Manual. Rochester, MN
Cartwright R, Young M, Mercer P, Bears M 1998 The role of REM sleep and dream variables in the prediction of remission from depression. Psychiatry Research 80: 249–55
Ford D E, Kamerow D B 1989 Epidemiologic study of sleep disturbances and psychiatric disorders: An opportunity for prevention? Journal of the American Medical Association 262: 1479–84
Foulkes D 1982 Children's Dreams: A Longitudinal Study. Wiley and Sons, New York
Hobson J A, McCarley R W 1977 The brain as a dream state generator: An activation-synthesis hypothesis of the dream process. American Journal of Psychiatry 134: 1335–48
Maquet P, Peters J-M, Aerts J, Delfiore G, Degueldre C, Luxen A, Franck G 1996 Functional neuroanatomy of human rapid-eye-movement sleep and dreaming. Nature 383: 163–6
Nofzinger E A, Mintun M A, Wiseman M B, Kupfer D, Moore R 1997 Forebrain activation in REM sleep: an FDG PET study. Brain Research 770: 192–201
Perlis M L, Giles D E, Buysse D J, Tu X, Kupfer D 1997 Self-reported sleep disturbance as a prodromal symptom in recurrent depression. Journal of Affective Disorders 42: 209–12
Reynolds C F, Kupfer D J 1987 Sleep research in affective illness: state of the art circa 1987. Sleep 10: 199–215
Solms M 1997 The Neuropsychology of Dreams: A Clinico-anatomical Study. L. Erlbaum Associates, Mahwah, NJ
Vogel G, Thurmond A, Gibbons P, Sloan K, Boyd M, Walker M 1975 REM sleep reduction effects on depressed syndromes. Archives of General Psychiatry 32: 765–7
R. D. Cartwright
Sleep States and Somatomotor Activity

The drive to sleep has an awesome power over our lives. If we go without sleep or drastically reduce it, the desire to sleep quickly becomes more important than life itself. The willingness to go to extraordinary efforts in order to obtain even a little sleep demonstrates vividly that sleep is a vital and necessary behavior. In fact, we often allow ourselves to be placed in life-threatening situations in order to satisfy the need to sleep.

But what is sleep, really? All of us feel, at some basic level, that we really do understand what sleep is all about. We certainly know how it feels, and the 'meaning' of the word is generally accepted in ordinary conversation. But, in point of fact, no one knows what sleep really is. Once we proceed beyond a simple description of an apparently quiet state that is somehow different from wakefulness, we find that the following major questions about the nature of sleep are largely unanswered: Why do we sleep? What is (are) the function(s) of sleep? What are the mechanisms that initiate and sustain sleep? And how, when, and why do we wake up?

In order to understand 'sleep,' we must clarify the key processes and mechanisms that initiate and maintain this state. From a purely descriptive point of view, we simply need to know what is going on in the brain (and body) during sleep.
But first, we must agree on how to describe and/or define those behaviors that we intuitively understand as constituting the sleep states.
1. NREM and REM Sleep

Traditionally, three physiological measures are employed to describe the states of sleep in controlled laboratory situations. They are the EEG (electroencephalogram, which reflects the electrical activity of the cerebral cortex), the EOG (electrooculogram, a recording of eye movements), and the EMG (electromyogram, an indication of the degree of muscle activity, i.e., contraction). Together, these measures are used to define and differentiate sleep in mammals, which comprises two very different states: nonrapid eye movement (NREM) sleep and rapid eye movement (REM) sleep. Each is nearly as different from the other as both are distinct from wakefulness.

Generally speaking, the EEG during NREM sleep consists of slow, large-amplitude waves. Involuntary slow, rolling eye movements occur during the transition from drowsy wakefulness to NREM sleep; otherwise, NREM sleep essentially lacks eye movements. Few motor events occur during NREM sleep; however, body repositioning and occasionally some motor
behavior, such as sleepwalking, talking, eating or cooking, take place during NREM sleep. In general, motor processes that occur during NREM sleep are comparable to those present during very relaxed wakefulness. The cortical EEG during REM sleep closely resembles the EEG of active wakefulness, i.e., it is of low voltage and high frequency. This is a surprising finding, considering that these two states are so dramatically different from a behavioral point of view. Bursts of rapid eye movements occur phasically during REM sleep. These eye movements are similar to those which would occur during wakefulness if one were looking at an object that went from right to left and then from left to right, rapidly and continuously, for a couple of seconds at a time. During REM sleep there is also a nonreciprocal flaccid paralysis, i.e., atonia, of the major muscle groups (with the principal exception being certain muscles used in respiration). However, at intervals that usually coincide with the phasic periods of rapid eye movements, there are brief muscle twitches that involve principally, but not exclusively, the distal muscles (e.g., the muscles of the fingers, toes, and face). REM sleep is subdivided into ‘phasic’ and ‘tonic’ periods. During phasic REM sleep periods there are brief episodes of rapid eye movements and muscle twitches as well as muscle atonia (Fig. 1). Tonic REM
Figure 1 Intracellular recording from a trigeminal jaw-closer motoneuron: correlation of membrane potential and state changes. The membrane potential increased rather abruptly at 3.5 min in conjunction with the decrease in neck muscle tone and transition from quiet (NREM) to active (REM) sleep. At 12.5 min, the membrane depolarized and the animal awakened. After the animal passed into quiet sleep again, a brief, aborted episode of active sleep occurred at 25.5 min that was accompanied by a phasic period of hyperpolarization. A minute later the animal once again entered active sleep, and the membrane potential increased. EEG trace, marginal cortex, membrane potential band pass on polygraphic record, DC to 0.1 Hz. PGO, ponto-geniculo-occipital potential (reprinted from Chase MH, Chandler SH, Nakamura Y 1980 Intracellular determination of membrane potential of trigeminal motoneurons during sleep and wakefulness. Journal of Neurophysiology 44: 349–58)
periods occur when the phasically occurring eye movements and muscle twitches are absent but the muscle atonia still persists.
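Because the three electrographic measures vary in a characteristic joint pattern across states, their combined reading amounts, to a first approximation, to a simple decision rule. The sketch below is purely illustrative and assumes each signal has already been reduced to a coarse summary for a scoring epoch; the function name, input encoding, and tone threshold are all invented, and real polysomnographic scoring applies many additional criteria.

def score_epoch(eeg_pattern, muscle_tone, eye_movements):
    # eeg_pattern: 'activated' (low-voltage, fast) or 'synchronized'
    #              (high-voltage, slow)
    # muscle_tone: mean EMG level for the epoch, arbitrary units
    # eye_movements: 'saccadic_bursts', 'slow_rolling', or 'none'
    ATONIA = 0.05          # hypothetical level below which tone counts as absent
    if eeg_pattern == 'synchronized':
        return 'NREM'      # slow, large-amplitude EEG; tone reduced but present
    if muscle_tone <= ATONIA:
        return 'REM'       # activated EEG plus atonia (saccadic bursts if phasic)
    return 'WAKE'          # activated EEG with maintained muscle tone

The ordering of the tests makes the point emphasized in the text: the EEG alone cannot separate waking from REM sleep, so the EMG (tone versus atonia) must do that work.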
2. Somatomotor Activity During REM Sleep

Because there is very little motor behavior during NREM sleep, the present overview of somatomotor activity during sleep consists mainly of a description of motor events which take place during REM sleep and entails: (a) an exploration of the mechanisms that control muscle activity during this sleep state, and (b) a description of the 'executive' mechanisms that are responsible for these REM sleep-related patterns of motor control (Chase and Morales 1994). In order to understand the mechanisms responsible for the control of motor activity during REM sleep, it is important to first describe the changes in muscle fiber activity that occur during this state. The passage from wakefulness to NREM sleep is accompanied by a decrease in the degree of contraction of somatic muscle fibers, that is, there is a decrease in muscle tone, i.e., hypotonia. Surprisingly, during the state of REM sleep there occurs somatomotor atonia, or the complete lack of tone in many somatic muscles (Fig. 1). In order to understand how atonia of the somatic musculature during REM sleep is achieved, it is necessary to understand how the activity of muscle fibers is, in general, controlled. First of all, muscles (i.e., muscle fibers) do one thing: they contract/shorten. When muscle fibers contract, they do so because command signals, which are really trains of action potentials, travel from the cell bodies of motoneurons along their axons to activate muscle fibers by changing the permeability of the muscle fiber membrane. A great many muscle groups are constantly contracting, at least to some degree, when we maintain a specific posture or perform a movement while we are awake. Thus, tone, or some degree of muscle fiber contraction, is dependent on the asynchronous, sustained discharge of motoneurons and the action potentials that travel down their axons to initiate the contraction of muscle fibers. There is a gradual but slight decline in muscle tone during NREM sleep compared with wakefulness. During REM sleep, there is a strikingly potent suppression of muscle activity and motoneuron discharge, which results in a complete loss of muscle tone, or atonia. Thus, the key to understanding atonia during REM sleep resides in understanding the manner in which the activity of motoneurons is reduced or eliminated during this state. There are two basic mechanisms that could, theoretically, be responsible for the elimination of motoneuron discharge during REM sleep: one is the direct inhibition of motoneurons (i.e., postsynaptic inhibition) and the second is a cessation of excitatory
Figure 2 High-gain intracellular recording of the membrane potential activity of a tibial motoneuron during wakefulness (A), quiet (NREM) sleep (B), and active (REM) sleep (C). Note the appearance, de novo during active sleep, of large-amplitude, repetitively occurring inhibitory postsynaptic potentials. Two representative potentials, which were aligned by their origins, are shown at higher gain and at an expanded time base (C1, 2). These potentials were photographed from the screen of a digital oscilloscope. The analog to digital conversion rate was 50 µs/min. During these recordings the membrane potential during active sleep was −67.0 mV; the antidromic action potential was 78.5 mV (reprinted from Morales FR, Boxer P, Chase MH 1987 Behavioral state-specific inhibitory postsynaptic potentials impinge on cat lumbar motoneurons during active sleep. Experimental Neurology 98: 418–35)
input to motoneurons (i.e., disfacilitation). These processes have been explored experimentally by recording intracellularly from individual motoneurons during REM sleep. It has been found that during REM sleep motoneurons become hyperpolarized and, as a consequence, relatively unresponsive to excitatory input. Analyses of the various membrane properties of these motoneurons indicate that their lack of responsiveness is due primarily to postsynaptic inhibition (Fig. 1). Thus, it is clear that postsynaptic
Figure 3 Action potential generation during wakefulness (A) and a rapid eye movement period of active (REM) sleep (B). Gradual membrane depolarization (bar in A′) preceded the development of action potentials during wakefulness, whereas a strong hyperpolarizing drive was evident during a comparable period of active sleep (bar in B′). This difference was also observed preceding the subsequent generation of each action potential during the rapid eye movement episode. Action potentials in A′ and B′ are truncated owing to the high gain of the recording. Resting membrane potentials are −58 mV in (A) and −63 mV in (B). Data are unfiltered; records were obtained from a single tibial motoneuron (reprinted from Chase MH, Morales FR 1982 Phasic changes in motoneuron membrane potential during REM periods of active sleep. Neuroscience Letters 34: 177–82)
inhibition is the primary mechanism that is responsible for atonia of the somatic musculature during this state. Postsynaptic inhibition of motoneurons during REM sleep occurs when certain neurons located in the brainstem liberate glycine at their points of contact with motoneurons, which are called synapses. When glycine is liberated, motoneurons respond by generating changes in their membrane properties and their ability to discharge, which are recorded as 'inhibitory' postsynaptic potentials. In an effort to elucidate the bases for motoneuron inhibition (which is viewed behaviorally as atonia) during REM sleep, spontaneous postsynaptic potentials recorded from motoneurons were examined. A unique pattern of sleep-specific inhibitory postsynaptic potentials was found to bombard motoneurons during REM sleep; they were of very large amplitude and occurred in great numbers (Fig. 2). Thus, it is clear that there is a unique set of presynaptic cells that are responsible for the REM sleep-specific inhibitory postsynaptic potentials that bombard motoneurons. For brainstem motoneurons (and likely
for spinal cord motoneurons as well), these presynaptic inhibitory neurons are located in the ventromedial medulla. In summary, postsynaptic inhibition is the principal process that is responsible for the atonia of the somatic musculature during REM sleep. This postsynaptic process is dependent on the presence of REM sleep-specific inhibitory potentials, which arise because glycine (an inhibitory neurotransmitter) is released by presynaptic neurons onto the surface of postsynaptic motoneurons. All of the inhibitory phenomena that have been described in the previous sections are not only present but also enhanced during the phasic rapid eye movement periods of REM sleep. How, then, could there be twitches and jerks of the eyes and limbs during phasic rapid eye movement periods when there is also enhanced inhibition? The answer is simple: most of these periods are accompanied not only by increased motoneuron inhibition but also by potent motor excitatory drives that impinge on motoneurons. Thus, during these phasic periods of REM sleep, even though there is a suppression of the activity of motoneurons, there occurs, paradoxically, the concurrent excitation of motoneurons whose discharge results in the activation of muscle fibers (which leads to twitches and jerks, mainly of the limbs and fingers) (Fig. 3). These patterns of activation reflect descending excitatory activity emanating from different nuclei in the pons and possibly from the forebrain as well. Consequently, from time to time, and for reasons as yet unknown, during REM sleep excitatory drives overpower the enhanced inhibitory drives. When this occurs, motoneurons discharge and the muscle fibers that they innervate contract. Thus, the increase in excitatory drives is actually accompanied by an increase in inhibitory drives (Fig. 3). When motoneurons discharge during the REM periods of active sleep, the contraction of the muscles that they innervate is unusual because the resultant movements are abrupt, twitchy, and jerky; they are also without apparent purpose. Thus, from the perspective of motoneurons, REM sleep can be characterized as a state dominated by strikingly potent postsynaptic inhibition that is punctuated, during the rapid eye movement periods, by bursts of excitation superimposed on still further enhanced inhibition.
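This push-pull arrangement (tonic inhibition with occasional phasic excitatory volleys) can be caricatured in a few lines of arithmetic. The sketch below is a deliberately minimal leaky-integrator model with invented parameters, not a biophysical simulation; it shows only why motoneuron discharges, and hence twitches, emerge when a phasic excitatory volley transiently overpowers the tonic glycinergic drive.

import random

def simulate_rem_motoneuron(steps=2000):
    # All values are illustrative: a leaky integrator hovering near rest,
    # pushed down by tonic inhibition and up by brief excitatory volleys.
    v_rest, v_thresh = -65.0, -55.0     # mV (round numbers, not measurements)
    v, burst_left, twitch_times = v_rest, 0, []
    for t in range(steps):
        inhibition = 8.0                # tonic glycinergic, hyperpolarizing drive
        if burst_left == 0 and random.random() < 0.005:
            burst_left = 30             # a phasic excitatory volley begins
        excitation = 25.0 if burst_left > 0 else 0.0
        burst_left = max(0, burst_left - 1)
        v += 0.1 * (-(v - v_rest) - inhibition + excitation)  # leaky integration
        if v >= v_thresh:               # discharge: innervated fibers contract
            twitch_times.append(t)
            v = v_rest                  # crude reset after the action potential
    return twitch_times

Between volleys the model settles about 8 mV below rest, the analog of the hyperpolarization recorded intracellularly, and can never reach threshold; during a volley the effective equilibrium rises above threshold and a brief train of discharges, a 'twitch,' results.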
3. The Executive Control Mechanism for Somatomotor Activity During REM Sleep

The preceding description of somatomotor control during REM sleep can also be viewed from the perspective of the central nervous system executive mechanisms that are responsible for this physiological pattern of activity. A critical structure
that is involved in the generation and/or maintenance of REM sleep, and especially the motor inhibitory aspects of this state, is the nucleus pontis oralis, which is located in the rostral portion of the pontine reticular tegmentum. This area is thought to be activated by cholinergic fibers that arise from cell bodies that are located in the laterodorsal pontine tegmentum and the pedunculopontine tegmentum. There is evidence that a neuronal mechanism that resides within or is part of the nucleus pontis oralis is involved in the generation of wakefulness (and motor excitation) as well as REM sleep (and motor inhibition). For example, studies of the nucleus pontis oralis have shown that it is central to the phenomenon of reticular response reversal, wherein stimulation of this pontine nucleus results in an increase in somatic reflex activity during wakefulness but, remarkably, the identical stimulus during REM sleep yields potent postsynaptic inhibition of the same somatic reflexes. We have suggested the existence of a neuronal switch, based on the phenomenon of reticular response reversal, that is responsible for controlling the animal's state and somatomotor inhibitory processes during wakefulness as well as REM sleep. The existence of this switch is based on the hypothesis that wakefulness occurs when inhibitory (GABAergic) neurons in the nucleus pontis oralis suppress the activity of REM sleep neurons that are also located in, or in the vicinity of, the nucleus pontis oralis (Xi et al. 1999). Thus, when REM sleep-controlling neurons discharge in this and related areas, the result is the generation of this state and its attendant patterns of somatomotor inhibition. In support of the preceding hypothesis, we have found that when the inhibitory neurotransmitter GABA is placed in the nucleus pontis oralis, prolonged periods of wakefulness and heightened motor activity are elicited in cats. Conversely, the application of bicuculline, a GABA-A antagonist, results in the occurrence of episodes of REM sleep and somatomotor inhibition of very long duration. We therefore conclude that a pontine GABAergic system plays a critical role in the control of wakefulness and REM sleep. There are also recent data which demonstrate that when cells of the nucleus pontis oralis discharge, they initiate a cascade of events, the first being the excitation of a group of premotor inhibitory neurons in the lower part of the brainstem, i.e., in the medulla (Morales et al. 1999). Fibers from these medullary neurons release glycine onto motoneurons, which results in atonia of the somatic musculature. When the motor inhibitory mechanisms of REM sleep cease to function properly, and there is still ongoing motoneuron excitation (and/or enhanced motor excitation during the phasic periods of REM sleep), a syndrome occurs in cats that is called 'REM without atonia.' In humans, a comparable syndrome is called 'REM Behavior Disorder.' When humans
and cats with this syndrome enter REM sleep, they begin to twitch violently, jump about, and appear to act out their dreams; their sleep is quite disrupted and they are certainly a threat both to themselves and, of course, to others. It is clear that the motor inhibition that occurs during REM sleep plays a critical role in the maintenance of this state. There is also no doubt that the simple release of glycine onto motoneurons during REM sleep, which results in atonia, i.e., a lack of muscle activity, has far-reaching implications.
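The switch hypothesis sketched above, in which GABAergic wake-related neurons and REM sleep-related neurons in and around the nucleus pontis oralis suppress one another, has the flavor of a bistable flip-flop, and a toy firing-rate model makes that point concrete. Everything in the sketch below is invented for illustration (the weights, rates, and pulse have no physiological counterparts); it shows only that mutual inhibition yields two stable states and that a transient push, loosely analogous to applying GABA or bicuculline to the region, flips the circuit from one state to the other.

def step(w, r, drive_w=0.0, drive_r=0.0, dt=0.1):
    # w: activity of the wake-on (GABAergic) population, in [0, 1]
    # r: activity of the REM-on population, in [0, 1]
    # Each population weakly excites itself and strongly inhibits the other.
    clip = lambda x: max(0.0, min(1.0, x))
    w_new = w + dt * (-w + clip(1.2 * w - 2.0 * r + drive_w))
    r_new = r + dt * (-r + clip(1.2 * r - 2.0 * w + drive_r))
    return w_new, r_new

# Start in 'wakefulness' (w high, r silenced); a brief excitatory pulse to
# the REM-on population (cf. bicuculline disinhibition) flips the switch,
# and the new state persists after the pulse ends.
w, r = 1.0, 0.0
for t in range(400):
    pulse = 3.0 if 100 <= t < 130 else 0.0
    w, r = step(w, r, drive_r=pulse)
# afterwards: w is near 0 and r near 1 until some opposite push arrives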
Bibliography

Chase M H, Chandler S H, Nakamura Y 1980 Intracellular determination of membrane potential of trigeminal motoneurons during sleep and wakefulness. Journal of Neurophysiology 44: 349–58
Chase M H, Morales F R 1982 Phasic changes in motoneuron membrane potential during REM periods of active sleep. Neuroscience Letters 34: 177–82
Chase M H, Morales F R 1994 The control of motoneurons during sleep. In: Kryger M H, Roth T, Dement W C (eds.) Principles and Practice of Sleep Medicine, 2nd edn. W.B. Saunders, Philadelphia, PA, pp. 163–75
Morales F R, Boxer P, Chase M H 1987 Behavioral state-specific inhibitory postsynaptic potentials impinge on cat lumbar motoneurons during active sleep. Experimental Neurology 98: 418–35
Morales F R, Sampogna S, Yamuy J, Chase M H 1999 c-fos expression in brainstem premotor interneurons during cholinergically-induced active sleep in the cat. Journal of Neuroscience 19: 9508–18
Xi M-C, Morales F R, Chase M H 1999 Evidence that wakefulness and REM sleep are controlled by a GABAergic pontine mechanism. Journal of Neurophysiology 82: 2015–19
M. H. Chase, J. Yamuy and F. R. Morales
Sleep: Neural Systems

Since we spend roughly one third of our lives asleep, it is remarkable that so little attention has been paid to the capacity of sleep to organize the social behavior of animals, including humans. To grasp this point, try to imagine life without sleep: no bedrooms and no beds; no time out for the weary parent of a newborn infant; and nothing to do all through the long, cold, dark winter nights. Besides demonstrating the poverty of the sociobiology of sleep, this thought experiment serves to cast the biological details which follow in the broader terms of social adaptation. Family life, reproduction, childrearing, and even economic commerce all owe their temporal organization to the power of sleep. The dependence of these aspects of our
lives on sleep also underlines its importance to deeper aspects of biological adaptation—like energy regulation—that are only beginning to be understood by modern scientists.
1. Behavioral Physiology of Sleep

Sleep is a behavioral state of homeothermic vertebrate mammals defined by: (a) characteristic relaxation of posture; (b) raised sensory thresholds; and (c) distinctive electrographic signs. Sleep tends to occur at certain times of day and becomes more probable as sleeplessness is prolonged. These two organizing principles are the pillars of Alex Borbely's two-factor model: factor one captures the temporal or 'circadian' aspect while factor two captures the energetic or 'homeostatic' aspect (see Sleep Behavior). Sleep is usually associated with a marked diminution of motor activity and with the assumption of recumbent postures. Typically the eyes close and the somatic musculature becomes hypotonic. The threshold to external stimulation increases and animals become progressively less responsive to external stimuli as sleep deepens. The differentiation of sleep from states of torpor in animals that cannot regulate core body temperature has an important phylogenetic correlation with the neural structures mediating the electrographic signs of sleep. These include the cerebral cortex and thalamus, whose complex evolution underlies the distinctive electroencephalographic features of sleep in the higher vertebrate mammals. Sleep also constitutes the state of entry to and exit from hibernation in mammalian species that regulate temperature at lower levels during winter. In humans, who can report upon the subjective concomitants of these outwardly observable signs of sleep, it is now clear that mental activity undergoes a progressive and systematic reorganization throughout sleep. On first falling asleep, individuals may progressively lose awareness of the outside world and experience microhallucinations and illusions of movement of the body in space; after sleep onset, mental activity persists but is described as thoughtlike and perseverative, if it can be recalled at all upon awakening. (For a description of the subsequent changes in mental activity, see the entry on dreaming.)
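The two-factor scheme is often written out as the 'two-process model': a homeostatic pressure S that builds during waking and discharges during sleep, gated by thresholds that a circadian process C sweeps up and down across the day. One common textbook form is given below; the time constants and threshold parameters are illustrative placeholders, not canonical values.

S_{\mathrm{wake}}(t) = \mu - (\mu - S_0)\, e^{-t/\tau_r}, \qquad S_{\mathrm{sleep}}(t) = S_0\, e^{-t/\tau_d}

H(t) = H_0 + a \cos\left(\frac{2\pi t}{24}\right), \qquad L(t) = L_0 + a \cos\left(\frac{2\pi t}{24}\right)

Here S rises exponentially toward an asymptote \mu while awake (time constant \tau_r) and decays while asleep (time constant \tau_d); sleep begins when S reaches the circadianly modulated upper threshold H(t) and ends when it falls to the lower threshold L(t), so that both the timing of sleep (factor one) and its pressure (factor two) emerge from a single diagram.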
2. Electro-physiological Aspects of Sleep

There is a complex organization of behavioral, physiological, and psychological events within each sleep bout. To detect this organization, it is necessary to record the electroencephalogram (EEG) from the surface of the head (or directly from the cortical
structures of the brain), to record the movement of the eyes by means of the electrooculogram (EOG), and to record muscle tone by means of the electromyogram (EMG). These three electrographic parameters allow one to distinguish sleep from waking and to distinguish two distinctive and cyclically recurrent phases within sleep: NREM (non-rapid eye movement) and REM (rapid eye movement) sleep. NREM, also known as synchronized or quiet sleep, is characterized by a change in the EEG from a low-amplitude, high-frequency to a high-amplitude, low-frequency pattern (see Fig. 1a). The degree to which the EEG is progressively synchronized (that is, of high voltage and low frequency) can be subdivided into four stages in humans: In stage one, the EEG slows to the theta frequency range (4–7 cycles per second or cps) and is of low voltage (<50 µV and arrhythmic). Stage two is characterized by the evolution of distinctive sleep spindles composed of augmenting and decrementing waves at a frequency of 12–15 cps and peak amplitudes of 100 µV. Stage three is demarcated by the addition to the spindling pattern of high-voltage (>100 µV) slow waves (1–4 cps), with no more than 50 percent of the record occupied by the latter. In stage four, the record is dominated by high-voltage (150–250 µV) slow waves (1–3 cps). At the same time that the EEG frequency is decreasing and the voltage increasing, muscle tone progressively declines and may be lost in most of the somatic musculature. Slow rolling eye movements first replace the rapid saccadic eye movements of waking and then also subside, with the eyes finally assuming a divergent upward gaze (Fig. 1a). After varying amounts of time, the progressive set of changes in the EEG reverses itself and the EEG finally resumes the low-voltage, fast character previously seen in waking. Instead of waking, however, behavioral sleep persists; muscle tone, at first passively decreased, is now actively inhibited; and there arise in the electrooculogram stereotyped bursts of saccadic eye movement called rapid eye movements (the REMs, which give this sleep state the name REM sleep). Major body movements are quite likely to occur during the NREM–REM transition. The REM phase of sleep has also been called activated sleep (to signal the EEG desynchronization) and paradoxical sleep (to signal the maintenance of increased threshold to arousal in spite of the activated brain). Human consciousness is associated with the low-voltage, fast EEG activity of waking and REM sleep but its unique character in dreaming depends upon other aspects of REM neurophysiology (see Fig. 1b). In all mammals (including aquatic, arboreal, and flying species) sleep is organized in this cyclic fashion: sleep is initiated by NREM and punctuated by REM at regular intervals. Most animals compose a sleep bout out of three or more such cycles, and in mature humans the average nocturnal sleep period consists of four to five such cycles, each of 90–100 minutes
Figure 1 (a) Behavioral states in humans. The states of waking, NREM, and REM sleep have behavioral, polygraphic, and psychological manifestations which are depicted here. In the behavioral channel, posture shifts, detectable by time-lapse photography or video, can be seen to occur during waking and in concert with phase changes of the sleep cycle. Two different mechanisms account for sleep immobility: disfacilitation (during stages I–IV of NREM sleep) and inhibition (during REM sleep). In dreams, we imagine that we move, but we do not. The sequence of these stages is schematically represented in the polygraph channel and sample tracings are also shown. Three variables are used to distinguish these states: the electromyogram (EMG), which is highest in waking, intermediate in NREM sleep, and lowest in REM sleep; and the electroencephalogram (EEG) and electrooculogram (EOG), which are both activated in waking and REM sleep and inactivated in NREM sleep. Each sample record is about 20 s long. Other subjective and objective state variables are described in the three lower channels. (b) Sample records of human sleep (Snyder and Hobson)
duration. After a prolonged period of wake activity (as in humans) the first cycles are characterized by a preponderance of high-voltage, slow wave activity (i.e., the NREM phase is enhanced early in the sleep bout) while the last cycles show more low-voltage, fast wave activity (i.e., the REM phase is enhanced late in the sleep bout). The period length is fixed across any and all sleep periods in a given individual but is shorter in immature and smaller animals, indicating a correlation of NREM–REM cycle length with brain size.
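For a single scoring epoch, the NREM staging criteria quoted above reduce to a small decision table over the dominant EEG frequency, the peak amplitude, and the fraction of the record occupied by slow waves. The toy function below renders only those quoted numbers; the function name, input encoding, and tie-breaking order are invented, and actual human sleep staging involves many further conventions.

def nrem_stage(dominant_freq_cps, peak_amplitude_uv, slow_wave_fraction,
               spindles_present):
    # Returns an NREM stage (1-4) from the criteria quoted in the text,
    # or None if the epoch does not fit these simplified rules.
    if slow_wave_fraction > 0.5 and peak_amplitude_uv >= 150:
        return 4   # dominated by 150-250 uV slow (1-3 cps) waves
    if spindles_present and peak_amplitude_uv > 100 and slow_wave_fraction <= 0.5:
        return 3   # spindles plus >100 uV slow (1-4 cps) waves, <=50% of record
    if spindles_present:
        return 2   # 12-15 cps spindles with peaks near 100 uV
    if 4 <= dominant_freq_cps <= 7 and peak_amplitude_uv < 50:
        return 1   # low-voltage, arrhythmic theta
    return None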
3. Neural Mechanisms of Sleep

In all animals, sleep is one of a number of circadian behavioral functions controlled by an oscillator in the suprachiasmatic nucleus of the hypothalamus. It is the interaction of the intrinsic propensity to sleep at about 24 hour (i.e., circadian) intervals with the daily cycles of extrinsic light and temperature that gives sleep its diurnal organization. The exact mechanism by which hypothalamic structures act to potentiate sleep at one or another diurnal period is unknown, but increasing evidence suggests that peptide hormones are involved in regulating this coordination. The hypothesis that sleep is hormonally regulated is an old one and is linked to the subjective impression of fatigue with its implied humoral basis. The fact that the sleep drive increases with the duration of time spent awake in both normal and sleep-deprived animals clearly indicates the homeostatic regulatory function of sleep, and experiments begun by Piéron at the beginning of the twentieth century suggested that a circulating humoral factor might mediate this propensity. Subsequent efforts to test the humoral hypothesis have involved the isolation of an S-muramyl peptide as a slow wave sleep-enhancing factor found both in the cerebrospinal fluid of sleep-deprived animals and in normal human urine. It remains to be established that this so-called factor S is involved in physiological sleep mediation; this is most important since factor S is known to be a product of bacterial cell walls but is not produced by the brain. Whatever its mediating mechanism, the onset of sleep has long been linked by physiologists to the concept of deafferentation of the brain. In early articulations of the deafferentation hypothesis it was thought that the brain essentially shut down when deprived of external inputs. This idea was put forth by Frederic Bremer to explain the results of midcollicular brain stem transection, after which the forebrain persistently manifested the synchronized EEG pattern of slow wave sleep. Bremer thought that the so-called cerveau isolé (or isolated forebrain) preparation was asleep because it had been deprived of its afferent input. That interpretation was upset by Bremer's own subsequent finding that transection of the neuraxis at C-1 produced intense EEG desynchronization and
hypervigilance in the so-called encéphale isolé (or isolated brain) preparation. The possibility of active neural mediation of arousal by brain stem structures situated somewhere between the midcollicular and high spinal levels was thus suggested. The faint possibility that afferent input (via the trigeminal nerve) might account for the difference between the two preparations was eliminated by Giuseppe Moruzzi's finding that the mediopontine pretrigeminal transected preparation was also hyperalert. Thus the notion of sleep mediation via passive deafferentation was gradually replaced by the idea of sleep onset control via the active intervention of brain stem neural structures. This new idea does not preclude a contribution from deafferentation; indeed, the two appear to be complementary processes. The clear articulation of the concept of active brain stem control of the states of sleep and waking came from the 1949 finding by Giuseppe Moruzzi and Horace Magoun that high-frequency stimulation of the midbrain reticular formation produced immediate desynchronization of the electroencephalogram and behavioral arousal in drowsy or sleeping animals. The most effective part of the reticular activating system was its rostral pole in the diencephalon, where the circadian clock is located. These data indicated that wakefulness was actively maintained by the tonic firing of neurons in an extensive reticular domain. Arousal effects could be obtained following high-frequency stimulation anywhere in the reticular formation, from the medullary to the pontine to the midbrain and diencephalic levels. The activating effect was thought to be mediated by neurons in the thalamus, which in turn relayed tonic activation to the entire neocortex. This concept has recently been confirmed at the cellular level by Mircea Steriade, following up on the earlier work of Dominic Purpura, which had indicated that the spindles and slow waves of NREM sleep were a function of a thalamocortical neuronal interaction that appeared whenever brain stem activation subsided. Once Moruzzi and Magoun had substantiated the concept of active neural control of the states of waking and sleep, other evidence supporting this notion promptly followed. For example, it was found by Barry Sterman and Carmine Clemente that high-frequency stimulation of the basal forebrain could produce slow wave sleep in waking animals. When the same region was lesioned by Walle Nauta, arousal was enhanced. The cellular basis of these basal forebrain effects and their possible link to both the circadian oscillator of the hypothalamus and the arousal system of the reticular activating system are currently under intense investigation. The second dramatic demonstration that sleep states were under active neural control was Michel Jouvet's lesion and transection work suggesting a role of the pontine brain stem in the timing and triggering of the REM phase of sleep. The pontine cat preparation
Figure 2 (a) Schematic representation of the REM sleep generation process. A distributed network involves cells at many brain levels (left). The network is represented as comprising 3 neuronal systems (center) that mediate REM sleep electrographic phenomena (right). Postulated inhibitory connections are shown as solid circles; postulated excitatory connections as open circles. In this diagram no distinction is made between neurotransmission and neuromodulatory functions of the depicted neurons. It should be noted that the actual synaptic signs of many of the aminergic and reticular pathways remain to be demonstrated, and, in many cases, the neuronal architecture is known to be far more complex than indicated here (e.g., the thalamus and cortex). Two additive effects of the marked reduction in firing rate by aminergic neurons at REM sleep onset are postulated: disinhibition (through removal of negative restraint) and facilitation (through positive feedback). The net result is strong tonic and phasic activation of reticular and sensorimotor neurons in REM sleep. REM sleep phenomena are postulated to be mediated as follows: EEG desynchronization results from a net tonic increase in reticular, thalamocortical, and cortical neuronal firing rates. PGO waves are the result of tonic disinhibition and phasic excitation of burst cells in the lateral pontomesencephalic tegmentum. Rapid eye movements are the consequence of phasic firing by reticular and vestibular cells; the latter (not shown) directly excite oculomotor neurons. Muscular atonia is the consequence of tonic postsynaptic inhibition of spinal anterior horn cells by the pontomedullary reticular formation. Muscle twitches occur when excitation by reticular and pyramidal tract motoneurons phasically overcomes the tonic inhibition of the anterior horn cells. Anatomical abbreviations: RN, raphe nuclei; LC, locus coeruleus; P, peribrachial region; FTG, gigantocellular tegmental field; FTC, central tegmental field; FTP, parvocellular tegmental field; FTM, magnocellular tegmental field; TC, thalamocortical; CT, cortical; PT cell, pyramidal cell; III, oculomotor; IV, trochlear; V, trigeminal motor nuclei; AHC, anterior horn cell. (Modified from Hobson et al. 1986) (b) Synaptic modifications of the original reciprocal interaction model based upon recent findings. Reported data from animals (cat and rodent) are shown as solid lines and some of the recently proposed putative dynamic relationships are shown as dotted lines. The exponential magnification of cholinergic output predicted by the original model can also occur in this model, with mutually excitatory cholinergic-non-cholinergic interactions (7.) taking the place of the previously postulated, mutually excitatory cholinergic-cholinergic interactions. The additional synaptic details can be superimposed on this revised reciprocal interaction model without altering the basic effects of aminergic and cholinergic influences on the REM sleep cycle. For example: i. Excitatory cholinergic-non-cholinergic interactions utilizing ACh and the excitatory amino acid transmitters enhance firing of REM-on cells (6., 7.) while inhibitory noradrenergic (4.), serotonergic (3.) and autoreceptor cholinergic (1.) interactions suppress REM-on cells. ii. Cholinergic effects upon aminergic neurons are both excitatory (2.), as hypothesized in the original reciprocal interaction model, and may also operate via presynaptic influences on noradrenergic-serotonergic as well as serotonergic-serotonergic circuits (8.). iii.
Inhibitory cholinergic autoreceptors (1.) could contribute to the inhibition of LDT and PPT cholinergic neurons which is also caused by noradrenergic (4.) and serotonergic (3.) inputs. iv. GABAergic influences (9., 10.) as well as other neurotransmitters such as adenosine and nitric oxide (see text) may contribute to the modulation of these interactions. Abbreviations: open circles, excitatory postsynaptic potentials; closed circles, inhibitory postsynaptic potentials; mPRF, medial pontine reticular formation; PPT, pedunculopontine tegmental nucleus; LDT, laterodorsal tegmental nucleus; LCα, peri-locus coeruleus alpha; 5HT, serotonin; NE, norepinephrine; ACh, acetylcholine; glut, glutamate; AS, aspartate; GABA, gamma-aminobutyric acid
Sleep: Neural Systems (with ablation of all forebrain structures above the level of the pontine tegmentum) continued to show periodic abolition of muscle tone and rapid eye movements at a frequency identical to the REM periods of normal sleep. The implications were (a) that the necessary and sufficient neurons for timing and triggering by REM were in the pons and (b) that this system became free-running when the pontine generator was relieved of circadian restraint from the hypothalamus. Small lesions placed in the pontine tegmentum between the locus coeruleus and the pontine reticular formation resulted in periods of REM sleep without atonia: the cats exhibited all the manifestations of REM sleep except the abolition of muscle tone, indicating that intrapontine connections were essential to mediate and coordinate different aspects of the REM sleep period.
4. A Cellular and Molecular Model of the Sleep–Wake Cycle

Microelectrode studies designed to determine the cellular mechanism of these effects have indicated that REM sleep is actively triggered by an extensive set of executive neurons ultimately involving at least the oculomotor, the vestibular, and the midbrain, pontine, and medullary reticular nuclei (see Fig. 2). The intrinsic excitability of this neuronal system appears to be lowest in waking, to increase progressively throughout NREM sleep, and to reach its peak during REM sleep. Cells in this system have thus been designated REM-on cells. In contrast, neurons in the aminergic brain stem nuclei (the serotonergic dorsal raphe and the catecholaminergic locus coeruleus and peribrachial region) have reciprocally opposite activity curves. These aminergic REM-off cells fire maximally in waking, decrease activity in NREM sleep, and reach a low point at REM onset. The reciprocal interaction model of state control ascribes the progressive excitability of the executive (REM-on) cell population to disinhibition by the modulatory (REM-off) cell population. How the aminergic systems are turned off is unknown, but Cliff Saper has recently suggested that GABAergic inhibition may arise from neurons in the hypothalamic circadian control system.
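The reciprocal interaction model also has a compact mathematical expression: McCarley and Hobson originally cast the REM-on/REM-off interaction in Lotka-Volterra (predator-prey) form, with the two populations' firing rates oscillating at the period of the NREM-REM cycle. The sketch below integrates equations of that general shape with invented, unfitted coefficients; it is meant only to reproduce the qualitative alternation of the two populations, not any measured physiology.

def reciprocal_interaction(x=1.0, y=1.5, steps=20000, dt=0.001):
    # x: REM-on (cholinergic) population activity; grows when aminergic
    #    restraint y is low, and in turn excites y.
    # y: REM-off (aminergic) population activity; decays on its own and is
    #    driven back up by x. Coefficients are illustrative placeholders.
    a, b, c, d = 1.0, 1.0, 1.0, 1.0
    trajectory = []
    for _ in range(steps):
        dx = a * x - b * x * y      # self-excitation checked by aminergic inhibition
        dy = -c * y + d * x * y     # aminergic decay reversed by REM-on drive
        x, y = x + dx * dt, y + dy * dt
        trajectory.append((x, y))   # simple Euler step: fine for a qualitative look
    return trajectory

Peaks of x (with troughs of y) play the role of REM episodes; the closed orbit around the fixed point x = c/d, y = a/b repeats with a fixed period, the analog of the regular NREM-REM alternation described above.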
5. Cholinergic REM Sleep Mediation

Confirming the early studies of Jouvet and Raul Hernandez-Peon, numerous investigations have found prompt and sustained increases in REM sleep signs when cholinergic agonist drugs are microinjected into the pontine brain stem of cats. The behavioral syndrome produced by this treatment is indistinguishable from the physiological state of REM sleep, except that it is precipitated directly from waking with no intervening NREM sleep. The cats can be aroused but revert to sleep immediately when stimulation abates. This drug-induced state has all the physiological signs of REM sleep (such as the low-voltage fast EEG, EMG atonia, REMs, and ponto-geniculo-occipital (PGO) waves), suggesting that it is a valid experimental model for physiological REM. Humans who are administered a cholinergic agonist by intravenous injection during the first NREM cycle also show potentiation of REM sleep; this cholinergically enhanced REM sleep is associated, as usual, with dreaming. Mixed nicotinic and muscarinic agonists (e.g., carbachol), pure muscarinic agonists (e.g., bethanechol), and dioxolane are equally effective potentiators of REM sleep, and recent results suggest that activation of the M2 acetylcholine receptor suffices to trigger REM sleep behavior. The inference that acetylcholine induces REM sleep is supported by Helen Baghdoyan's finding that neostigmine (an acetylcholinesterase inhibitor, which prevents the breakdown of endogenously released acetylcholine) also potentiates sleep after some delay. All of these effects are dose-dependent and are competitively inhibited by the acetylcholine antagonist atropine. REM sleep induction is obtained only by injection of cholinergic agonists into the pons: midbrain and medullary injections produce intense arousal with incessant turning and barrel-rolling behavior, respectively. Within the pons, the response pattern differs according to the site of injection, with maximal effects obtained from a region anterodorsal to the pontine reticular formation and bounded by the dorsal raphe and the locus coeruleus nuclei. In the peribrachial pons, where both aminergic and cholinergic neurons are intermingled, carbachol injection produces continuous PGO waves by activating the cholinergic PGO burst cells of the region. But agonist induction of these drug-induced waves is state-independent and REM sleep is not potentiated in the
Sleep: Neural Systems first 24 hours following drug injection. It is only later that REM sleep increases reaching its three- to fourfold peak at 48–72 hours and thereafter remaining elevated for 6–10 days. By injection of carbachol into the sub- and peri-locus coeruleus regions of the pontine reticular formation, where tonic, neuronal activity can be recorded during REM sleep, atonia may be generated, while waking persists, suggesting a possible animal model of human cataplexy. Cholinergic cells of the PGO wave generator region of the peribrachial pons are known to project to both the lateral geniculate nucleus and the perigeniculate sectors of the thalamic reticular nucleus, where they are probably responsible for the phasic excitation of neurons in REM sleep. This supposition is supported by the finding that the PGO waves of the LGN are blocked by nicotinic antagonists. Wolf Singer’s recent work suggests the functional significance of these internally generated signals. Singer proposes that the resulting resonance contributes to the use-dependent plasticity of visual cortical networks. Besides providing evidence that REM sleep signs are cholinergically mediated, these recent pharmacological results provide neurobiologists with a powerful experimental tool, since a REM-sleeplike state can be produced at will. Thus Francisco Morales and Michael Chase have shown that the atonia produced by carbachol injection produces the same electrophysiological changes in lumbar motoneurons as does REM sleep: a decrease in input resistance and membrane time constant, and a reduction of excitability associated with discrete IPSPs. These findings facilitate the exploration of other neurotransmitters involved in the motorneuronal inhibition during REM sleep which is mediated, at least in part, by glycine. Another important advance is the intracellular recording of pontine reticular neurons in a slice preparation by Robert McCarley’s group. Neuronal networks can be activated cholinergically, producing the same electrophysiological changes as those found in REM sleep of intact animals. A low threshold calcium spike has been identified as the mediator of firing pattern alterations in brain-stem slice preparations. The activation process within the pons itself has also invited study, using carbachol as an inducer of REM sleep. Using the moveable microwire technique, Peter Shiromani and Dennis McGinty showed that many neurons normally activated in REM sleep either were not activated or were inactivated by the drug. KenIchi Yamamoto confirmed this surprising finding and found that the proportion of neurons showing REMsleeplike behavior was greatest at the most sensitive injection sites. To achieve greater anatomical precision in localizing the carbachol injection site, James Quattrochi conjugated the cholinergic agonist carbachol to fluorescent microspheres, thereby reducing the rate of 14184
diffusion tenfold and allowing all neurons projecting to the injection site to be identified by retrograde labeling. Surprisingly, the behavioral effects are no less potent despite the reduced diffusion rate of the agonist. They are maximized by injection into the same anterodorsal site described by Helen Baghdoyan. That site receives input from all the major nuclei implicated in control of the sleep cycle: the pontine gigantocellular tegmental field (glutamatergic), the dorsal raphe nuclei (serotonergic), the locus coeruleus (noradrenergic), and the dorsolateral tegmental and pedunculo-pontine nuclei (cholinergic). Thus the sensitive zone appears to be a point of convergence of the neurons postulated to interact reciprocally in order to generate the NREM–REM sleep cycle. Whether this is also a focal point of reciprocal projection back to those input sites can now be investigated.
6. Conclusions

Because sleep occurs in a social context, and because sleep actively determines many aspects of that context, its study constitutes fertile ground for integration across domains of inquiry. At the basic science level, the cellular and molecular mechanisms emphasized here reach inexorably down to the level of the genome. The upward extension of sleep psychophysiology to the psychology of individual consciousness, to dreaming, and hence to the mythology that guides cultures is also advancing rapidly. Altogether absent is a sound bridge to the social and behavioral realms, where sleep has been ignored almost as if it were a non-behavior and hence devoid of social significance. Yet when, where, and especially with whom we sleep roots us profoundly in our domestic and interpersonal worlds.

See also: Circadian Rhythms; Dreaming, Neural Basis of; Hypothalamus; Sleep and Health; Sleep Disorders: Psychiatric Aspects; Sleep Disorders: Psychological Aspects; Sleep States and Somatomotor Activity
Bibliography

Datta S, Calvo J, Quattrochi J, Hobson J A 1991 Long-term enhancement of REM sleep following cholinergic stimulation. NeuroReport 2: 619–22
Gerber U, Stevens D R, McCarley R W, Greene R W 1991 Muscarinic agonists activate an inwardly rectifying potassium conductance in medial pontine reticular formation neurons of the rat in vitro. Journal of Neuroscience 11: 3861–7
Hobson J A 1999 Consciousness. Scientific American Library
Jones B E 1991 Paradoxical sleep and its chemical/structural substrates in the brain. Neuroscience 40: 637–56
Steriade M, Dossi R C, Nunez A 1991 Network modulation of a slow intrinsic oscillation of cat thalamocortical neurons implicated in sleep delta waves: Cortically induced synchronization and brainstem cholinergic suppression. Journal of Neuroscience 11: 3200–17
Steriade M, McCarley R W 1990 Brainstem Control of Wakefulness and Sleep. Plenum Press, New York
J. A. Hobson
Small-group Interaction and Gender

When people interact together in groups that are small enough to allow person-to-person contact (2 to 20 people), regular patterns of behavior develop that organize their relations. The study of gender in this context examines how these patterns of behavior are affected by members' social locations as men or women and the consequences this has for beliefs about gender differences and for gender inequality in society.
1. The Emergence of the Field

The systematic, empirical study of small group interaction and gender (hereafter, gender and interaction) developed during the 1940s and 1950s out of the confluence of a general social scientific interest in small groups and structural-functionalist theorizing about the origin of differentiated gender roles for men and women. Parsons and Bales (1955) argued that small groups, like all social systems, must manage instrumental functions of adaptation and goal attainment while also attending to expressive functions of group integration and the well-being of members. Differentiated instrumental and expressive gender roles develop to solve this problem in the small group system of the family. Children internalize these functionally specialized roles as personality traits that shape their behavior in all groups. Although eventually discredited on logical and empirical grounds, the functional account of gender roles stimulated broader empirical attention to gender and interaction, not only within the family but also in task-oriented groups such as committees and work groups. Evidence accumulated that gender's effects on interaction are complex and quite context specific (see Aries 1996; Deaux and LaFrance 1998). This evidence is inconsistent with the view that gender is best understood as stable individual traits that affect men's and women's behavior in a consistent manner across situations. These findings on interaction contributed to a gradual transformation of social scientific approaches to gender. From an earlier view of gender as a matter of individual personality and family relations, social science has increasingly approached gender as a broad system of social difference and inequality that is best studied as part of social stratification as well as individual development and family organization.
Considering gender in a broader context draws attention to how the interactional patterns it entails are both similar to and different from those that characterize other systems of difference and inequality such as those based on race, ethnicity, or wealth. While interaction occurs between the advantaged and the disadvantaged on each of these forms of social difference, the rate of interaction across the gender divide is distinctively high. Gender divides the population into social categories of nearly equal size; it cross-cuts kin and households and is central for reproduction. Each of these factors increases the frequency with which men and women interact and the intimacy of the terms on which they do so. Furthermore, research on social cognition has shown that people automatically sex categorize (i.e., label as male or female) any concrete person with whom they interact. As a consequence, gender is potentially at play in all interaction and interactional events are likely to be important for the maintenance or change of a society’s cultural beliefs and practices about gender. West and Zimmerman (1987) argue that for gender to persist as a social phenomenon, people must continually ‘do gender’ by presenting themselves in ways that allow others to relate to them as men or women as that is culturally defined by society. Recognition of these distinctive aspects of gender has increased attention to the role that gendered patterns of interaction play in gender inequality. Interaction mediates the process by which people form social bonds, are socially evaluated by others, gain influence, and are directed towards or away from positions of power and valued social rewards. Interaction also provides the contexts in which people develop identities of competence and sociality. To the extent that gender moderates these interaction processes, it shapes the outcomes of men and women as well as shared beliefs about gender.
2. Current Theories

Four theoretical approaches predominate. Two, social role theory and expectation states theory, single out a society's cultural beliefs about the nature and social value of men's and women's traits and competencies as primary factors that create gendered patterns of interaction in that society. The theories conceptualize these beliefs in slightly different terms (i.e., as gender stereotypes or gender status beliefs) but are in substantial agreement about their explanatory impact on interaction through the expectations for behavior that develop when these beliefs are evoked by the group situation. The remaining theories take somewhat different approaches.

2.1 Social Role Theory

Eagly's (1987) social role theory argues that widely shared gender stereotypes develop from the gender
division of labor that characterizes a society. In western societies, men's greater participation in paid positions of higher power and status and the disproportionate assignment of nurturant roles to women have created stereotypes that associate agency with men and communion with women. In addition, the gendered division of labor gives men and women differentiated skills. When gender stereotypes are salient in a group because of a mixed sex membership or a task or context that is culturally associated with one gender, stereotypes shape behavior directly through the expectations members form for one another's behavior. When group members enact social roles that are more tightly linked to the context than gender, such as manager and employee in the workplace, these more proximate roles control their behavior rather than gender stereotypes. Even in situations where gender stereotypes do not control behavior, however, men and women may still act slightly differently due to their gender differentiated skills. Social role theory has a broad scope that applies to interaction in all contexts and addresses assertive, power-related behaviors as well as supportive or feeling-related behaviors (called socioemotional behaviors). The explanations offered by the theory are not highly specific or detailed, however. The theory predicts that women will generally act more communally and less instrumentally than men in the same context, that these differences will be greatest when gender is highly salient in the situation, and that gender differences will be weak or absent when people enact formal, institutional roles.
2.2 Expectation States Theory

Another major approach, Berger and co-workers' expectation states theory, offers more detailed explanations within a narrower scope. The theory addresses the hierarchies of influence and esteem that develop among group members in goal-oriented contexts and makes predictions about when and how gender will shape these hierarchies due to the status value gender carries in society (see Ridgeway 1993; Wagner and Berger 1997). It does not address socioemotional behavior. Gender status beliefs are cultural beliefs that one gender (men) is more status worthy and generally more competent than the other (women), in addition to each having gender specific competencies. When gender status beliefs become salient due to the group's mixed sex or gender associated context, they create implicit expectations in both men and women about the likely competence of goal oriented suggestions from a man compared to those from a similar woman. These often unconscious expectations shape men and women's propensity to offer their ideas to the group, to stick with those ideas when others disagree, to positively evaluate the ideas of others, and
to accept or resist influence from others, creating a behavioral influence hierarchy that usually advantages men over women in the group. In mixed sex groups with a gender-neutral task, the theory predicts that men will participate more assertively and be more influential than women. If the group task or context is culturally linked to men, their influence advantage over women will be stronger. If the task or context is associated with women's culturally expected competencies, however, the theory predicts that women will be somewhat more assertive and influential than men. There should be no gender differences in assertive influence behavior between men and women in same sex groups with a gender-neutral task, since gender status beliefs should not be salient.
2.3 Structural Identity Theories

A set of symbolic interactionist theories, including Heise and Smith-Lovin's affect control theory and Burke's identity theory, forms the structural identity approach (see Ridgeway and Smith-Lovin 1999 for a review). It, too, emphasizes shared cultural meanings about gender but focuses on the identity standards those beliefs create for individuals in groups. People learn cultural meanings about what it is to be masculine or feminine and these meanings become a personal gender identity standard that they seek to maintain through their actions. Identity standards act like control systems that shape behavior. If the context of interaction causes a person to seem more masculine or feminine than his or her gender identity standard, the person reacts with compensatory behaviors (e.g., warm behaviors to correct a too masculine impression). Consequently, different actions serve to express and maintain gender identities in different situational contexts. Since people automatically sex categorize one another, this approach assumes that gender identity standards affect behavior in all interaction, although the extent of their impact varies with gender's salience in the context. Gender is often a background identity that modifies other, more situationally prominent identities, such as woman judge. Unlike the other theories, the predictions of structural identity theories focus primarily on the behavioral reactions gender produces to events in small groups.
2.4 Two-cultures Theory
Maltz and Borker’s (1982) two cultures theory, popularized by Tannen (1990), takes a different approach. It limits its scope to informal, friendly interaction. People learn rules for friendly conversation from peers in childhood, it argues. Since these peer groups tend to
be sex-segregated and because children exaggerate gender differences in the process of learning gender roles, boys' and girls' groups develop separate cultures that are gender-typed. Girls learn to use language to form bonds of closeness and equality, to criticize in nonchallenging ways, and to accurately interpret the intentions of others. Boys learn to use speech to compete for attention and assert positions of dominance. In adult mixed sex groups, these rules can cause miscommunication because men and women have learned to attribute different meanings to the same behavior. Men and women's efforts to accommodate each other in mixed-sex interaction, however, modify their behavior slightly, reducing gender differences. In same sex interaction, gendered styles of interaction are reinforced. Thus, two cultures theory predicts greater gender differences in behavior between men and women in same-sex groups than in mixed sex groups. The theory has been criticized for ignoring status and power differences between men and women and for oversimplifying childhood interaction patterns (see Aries 1996).
3. Research Findings

The body of systematic evidence about men's and women's behaviors in small group interaction is large and growing. Several methodological concerns must be kept in mind in order to interpret this evidence and infer general patterns.
3.1 Methodological Issues

Interaction in small groups is an inherently local phenomenon that is embedded within larger sociocultural structures and affected by many aspects of those structures besides gender. Three methodological problems result. First, care is required to ascertain that behavioral differences between men and women in a situation are indeed due to their gender and not to differences in their other roles, power, or statuses in the situation. Second, reasonably large samples of specific interactional behaviors are necessary to infer gendered patterns in their use. Third, attention must be paid to the specific cultural context within which the group is interacting. At present, almost all systematic research has been based on small groups in the US composed predominantly of white, middle-class people. Since several theories emphasize the importance of cultural beliefs about gender in shaping interaction, researchers must be alert to subcultural and cross-cultural variations in these beliefs and appropriately condition their empirical generalizations about gendered patterns of interaction. The available studies that compare US populations such as African-Americans, whose gender beliefs are less
polarized than the dominant beliefs find that gender differences in interaction are also less for these populations (Filardo 1996).
3.2 Empirical Patterns in North American Groups

Taking these methodological concerns into account, narrative and meta-analytic reviews of research suggest several provisional conclusions about gender and interaction in groups governed by the dominant culture of North American society. Aries (1996), Deaux and LaFrance (1998), and Ridgeway and Smith-Lovin (1999) provide reviews of the research on which these conclusions are based. Gender differences in behavior do not always occur in small groups and vary greatly by context. Behavioral expectations associated with the specific focus and institutional context of the small group (e.g., the workplace, a committee, a friendship group, a student group) generally are more powerful determinants of both men's and women's behavior than gender. When gender differences occur, they tend to be small or moderate in effect size, meaning that there is usually at least a 70 percent overlap in the distributions of men's and women's behavior. When men and women are in formal, prescribed roles with the same power and status, there are few if any differences in their behavior. Research has shown that men and women in equivalent leadership or managerial roles interact similarly with subordinates of either sex. On the other hand, when women are gender-atypical occupants of positions of power, they are sometimes perceived by others as less legitimate in those roles and elicit more negative evaluations when they behave in a highly directive, autocratic way than do equivalent men. These findings are in accord with the predictions of social role theory and expectation states theory. Influence over others and assertive, goal-directed behavior such as participation rates, task suggestions, and assertive patterns of gestures and eye gaze are associated with power and leadership in small groups. In mixed-sex groups with a gender-neutral task, men have moderately higher rates of assertive behaviors and influence than do women who are otherwise their peers. When the group task or context is culturally linked to men, this gender difference increases. When the task or context is one associated with women, however, women's rates of assertive behaviors and influence are slightly higher than men's. When performance information clearly demonstrates that women in a mixed-sex group are as competent as the men, gender differences in assertiveness and influence disappear. In same-sex groups, there are no differences between men's and women's rates of assertive behaviors or influence levels. These patterns closely match the predictions of expectation states theory, are
consistent with social role theory, and inconsistent with two-cultures theory. They suggest that gender status beliefs in society and the expectations for competence in the situation that they create are an important determinant of gender differences in power and assertiveness in groups, independent of men's and women's personalities or skills. They indicate as well that both men and women act assertively or deferentially depending on the situational context. Men, like women, show higher rates of socioemotional behaviors when they are in subordinate rather than superordinate positions in groups. These are verbal and nonverbal behaviors that support the speech of others, express solidarity, and show active, attentive listenership. In mixed-sex groups, women engage in slightly more socioemotional behavior than men. However, women engage in the highest rates of socioemotional behaviors, and men the lowest, in same-sex groups. The latter findings are the only ones in partial accord with two-cultures theory. That theory, however, does not account for the partial association of socioemotional behaviors with lower status positions. Nor do status factors alone explain women's increased socioemotional behaviors in all-female groups. Assertive, instrumental behaviors appear to reflect power, competence, and status equally for men and women and, thus, do not reliably mark gender identity for the actor. To the extent that people signal gender identity consistently across interaction contexts, they appear to do so primarily through socioemotional behaviors that are less associated with instrumental outcomes.
4. Conclusions

Both the gender division of labor and gender inequality in a society depend on its cultural beliefs about the nature and social value of gender differences in competencies and traits. Such taken-for-granted beliefs allow actors to be reliably categorized as men and women in all contexts and understood as more or less appropriate candidates for different roles and positions in society. For such cultural beliefs to persist, people's everyday interactions must be organized to support them. The empirical evidence from North America suggests that unequal role and status relationships produce many differences in interactional behavior that are commonly attributed to gender. Network research suggests that most interactions between men and women actually occur within the structural context of unequal role or status relations (see Ridgeway and Smith-Lovin 1999). These points together may account for the fact that people perceive gender differences to be pervasive in interaction, while studies of actual interaction show few behavioral differences between men and women of equal status
and power. Small group interaction is an arena in which the appearance of gender differences is continually constructed through power and status relations and identity marking in the socioemotional realm. Theory and research on gender and interaction have focused on the way cultural beliefs about gender and structural roles shape interaction in ways that confirm the cultural beliefs. New approaches investigate the ways that interactional processes may perpetuate or undermine gender inequality in a society as that society undergoes economic change. If the cultural beliefs about gender that shape interaction change more slowly than economic arrangements, people interacting in gendered ways may rewrite gender inequality into newly emerging forms of socioeconomic organization in society. On the other hand, rapidly changing socioeconomic conditions may change the constraints on interaction between men and women in many contexts so that people’s experiences undermine consensual beliefs about gender and alter them over time. See also: Androgyny; Cultural Variations in Interpersonal Relationships; Feminist Theory; Gender and Feminist Studies in Psychology; Gender and Feminist Studies in Sociology; Gender and Language: Cultural Concerns; Gender Differences in Personality and Social Behavior; Gender Ideology: Crosscultural Aspects; Gender-related Development; Groups, Sociology of; Interactionism: Symbolic; Interpersonal Attraction, Psychology of; Language and Gender; Male Dominance; Masculinities and Femininities; Social Networks and Gender; Social Psychology: Sociological; Social Psychology, Theories of; Social Relationships in Adulthood; Stereotypes, Social Psychology of
Bibliography

Aries E 1996 Men and Women in Interaction: Reconsidering the Differences. Oxford University Press, New York
Deaux K, LaFrance M 1998 Gender. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, Boston, Vol. 1, pp. 788–827
Eagly A H 1987 Sex Differences in Social Behavior: A Social-Role Interpretation. Erlbaum, Hillsdale, NJ
Filardo E K 1996 Gender patterns in African-American and white adolescents' social interactions in same-race, mixed-sex groups. Journal of Personality and Social Psychology 71: 71–82
Maltz D N, Borker R A 1982 A cultural approach to male-female miscommunication. In: Gumperz J J (ed.) Language and Social Identity. Cambridge University Press, Cambridge, UK, pp. 196–256
Parsons T, Bales R F 1955 Family, Socialization, and Interaction Process. Free Press, Glencoe, IL
Ridgeway C L 1993 Gender, status, and the social psychology of expectations. In: England P (ed.) Theory on Gender\Feminism on Theory. Aldine de Gruyter, New York
Ridgeway C L, Smith-Lovin L 1999 The gender system and interaction. Annual Review of Sociology 25: 191–216
Tannen D 1990 You Just Don't Understand: Women and Men in Conversation. Morrow, New York
Wagner D G, Berger J 1997 Gender and interpersonal task behaviors: Status expectation accounts. Sociological Perspectives 40: 1–32
West C, Zimmerman D 1987 Doing gender. Gender and Society 1: 125–51
C. L. Ridgeway
Smith, Adam (1723–90)

Adam Smith was born in Kirkcaldy, in the County of Fife, and baptized on June 5, 1723 (the date of birth is unknown). He was the son of Adam Smith, Clerk to the Court-Martial, later Comptroller of Customs in the town, and of Margaret Douglas of Strathendry.
1. Early Years

Smith entered Glasgow University in 1737—at the not uncommon age of 14. He was fortunate in his mother's choice of University and in the period of his attendance. The old, nonspecialised system of 'regenting' had been replaced in the late 1720s by a new arrangement whereby individuals professed a single subject. There is little doubt that Smith benefited from the teaching of Alexander Dunlop (Greek) and of Robert Dick (Physics). But two men in particular are worthy of note in view of Smith's later interests. The first is Robert Simson (1687–1768) whom Smith later described as one of the 'greatest mathematicians that I have ever had the honour to be known to, and I believe one of the two greatest that have lived in my time' (TMS, III.2.20). Smith could well have acquired from Simson and Matthew Stewart (father of Dugald) his early and continuing interest in mathematics. Dugald Stewart recalled in his memoir (Stewart 1977) that Smith's favorite pursuits while at university were mathematics and natural philosophy. Campbell and Skinner noted that 'Simson's interests were shared generally in Scotland. From their stress on Greek geometry the Scots built up a reputation for their philosophical elucidation of Newtonian fluxions, notably in the Treatise on Fluxions (1742) by Colin Maclaurin (1698–1746), another pupil of Simson's who held chairs of mathematics in Aberdeen and Edinburgh' (1982, p. 20). But important as it undoubtedly was, Simson's influence upon Smith pales by comparison with that exerted by Francis Hutcheson (1694–1746). A student of Gerschom Carmichael, the first Professor of Moral
Philosophy, Hutcheson succeeded him in 1729. A great stylist, Hutcheson lectured in English (rather than Latin) 3 days a week on classical sources and 5 days on Natural Religion, Morals, Jurisprudence, and Government. As Dugald Stewart was to observe, Hutcheson's lectures contributed to diffuse, in Scotland, the taste for analytical discussion, and a spirit of liberal enquiry. It is believed that Smith graduated from Glasgow in 1740 and known that he was elected to the Snell Exhibition (scholarship) in the same year. He matriculated at Balliol College on July 7, and did not return to Scotland until 1746—the year that ended the Jacobite Rebellion and saw the death of Hutcheson. The 6 years spent in Oxford were often unhappy. Smith had a retiring personality, his health was poor, and the College was pro-Jacobite and 'anti-Scottish.' Some members were also 'unenlightened'; a fact confirmed by the confiscation by one tutor of David Hume's Treatise of Human Nature. And yet Smith was to write to the Principal of Glasgow University (Archibald Davidson), on the occasion of his election as Lord Rector (letter 274, dated November 16, 1787):

No man can owe greater obligations than I do to the University of Glasgow. They educated me, they sent me to Oxford, soon after my return to Scotland they elected me one of their members, and afterwards preferred me to another office, to which the abilities and virtues of the never to be forgotten Dr Hutcheson had given a superior degree of illustration.
The reference to Oxford was no gilded memory. Balliol had one of the best libraries in Oxford, and Smith's occasional ill-health may be explained by his enthusiastic pursuit of its riches. It has been assumed that Smith developed an interest in Rhetoric and Belles Lettres during this period, although it now seems likely, in view of his later career, that he also further developed longer-standing interests in literature, science, ethics, and jurisprudence.
2. Professor in Glasgow

Smith returned to Kirkcaldy in 1746 without any fixed plan. But his wide-ranging interests must have been known to his friends, three of whom arranged a program of public lectures which were delivered in Edinburgh between 1748 and 1751. The three friends were Robert Craigie of Glendoick, James Oswald of Dunnikier, and Henry Home, Lord Kames. The lectures were of an extramural nature and delivered to a 'respectable auditory.' They probably also included material on the history of science (astronomy), jurisprudence, and economics. The success of Smith's courses in Edinburgh no doubt led to his appointment to the Glasgow Chair of Logic and Rhetoric in 1751—where once again he
enjoyed the support of Henry Home, Lord Kames. Evidence gathered from former pupils confirms that Smith did lecture on rhetoric but also that he continued to deploy a wide range of interests, leading to the conclusion that he continued to lecture on the history of philosophy and science in the Glasgow years (Mizuta 2000, p. 101). Smith attached a great deal of importance to his essay on the history of Astronomy (Corr, letter 137), the major part of which may well have been completed soon after leaving Oxford (Ross 1995, p. 101). Adam Smith was translated to Hutcheson's old chair of Moral Philosophy in 1752. As John Millar recalled to Dugald Stewart, the course was divided into four parts: natural theology, ethics, jurisprudence, and expediency (economics). Millar also confirmed that the substance of the course on ethics reappeared in The Theory of Moral Sentiments (1759) and that the last part of the course featured in the Wealth of Nations (1776). We know from Smith's own words that he hoped to complete his wider plan by writing a 'sort of Philosophical History of all the different branches of literature, of philosophy, poetry and eloquence' together with 'a sort of theory and history of law and government' (Corr, letter 248). Smith returned to this theme in the advertisement to the sixth and final edition of TMS in noting that TMS and WN were parts of a wider study. 'What remains, the theory of jurisprudence, which I have long projected, I have hitherto been hindered from executing, by the same occupations which had till now prevented me from revising the present work.' But the outlines of the projected study are probably revealed by the content of LJ(A) and LJ(B), and by those passages in WN which can now be recognized as being derived from them (e.g., WN, III, and V.i.a.b). The links between the parts of the great plan are many and various. The TMS, for example, may be regarded as an exercise in social philosophy, which was designed in part to show the way in which so self-regarding a creature as man erects (by natural as distinct from artificial means) barriers against his own passions, thus explaining the observed fact that he is always found in 'troops and companies.' The argument places a good deal of emphasis on the importance of general rules of behaviour which are related to experience and which may thus vary in content, together with the need for some system of government as a precondition of social order. The historical analysis, with its four socioeconomic stages, complements this argument by formally considering the origin of government and by explaining to some extent the forces which cause variations in accepted standards of behavior over time. Both are related in turn to Smith's treatment of political economy. The most polished accounts of the emergent economy and of the psychology of the 'economic man' are to be found, respectively, in the third book of WN and
in Part VI of TMS that was added in 1790. Yet both areas of analysis are old and their substance would have been communicated to Smith’s students and understood by them to be what they possibly were: a preface to the treatment of political economy. From an analytical point of view, Smith’s treatment of economics in the final part of LJ was not lacking in sophistication, which is hardly surprising in view of his debts to Francis Hutcheson, and through him, to Gerschom Carmichael, and thus to Pufendorf.
3. The French Connection

Despite his anxieties, the TMS proved to be successful (Corr, letter 31) and attracted the attention of Charles Townshend, the statesman, who set out to persuade Smith to become tutor to the young Duke of Buccleuch. Smith eventually agreed, and left Glasgow with two young charges to begin a stay of almost 2 years in France. Smith resigned his Chair in February 1764 (Corr, letter 87). The party spent many months in Toulouse before visiting Avignon and Geneva (where Smith met the much-admired Voltaire) prior to reaching Paris to begin a stay of some 10 months. From an intellectual point of view, the visit was a resounding success and arguably influential in the sense that Smith was able to meet, amongst many others, François Quesnay and A. R. J. Turgot. Quesnay's economic model dates from the late 1750s but it is noteworthy that he was working on a new version of the Tableau, the Analyse, during the course of Smith's visit. It is also known that Smith met Turgot at this time and that the latter was at work on the Reflections. While Smith wrote a detailed commentary on Physiocratic teaching (WN IV.ix), his lectures on economics, as delivered in the closing months of 1763, did not include a model of the kind which he later associated with the French Economists—thus suggesting that he must have found a great deal to think about in the course of 1766. But there were more immediate concerns. The young Duke was seriously ill in the summer, leading Smith to call upon the professional services of Quesnay. Smith's distress was further compounded by the illness and death of the younger brother, Hew Scott. The visit to Paris was ended abruptly. Smith spent the winter of 1766 in London and returned to Kirkcaldy in the following Spring to begin a stay of more than 6 years during which he worked upon successive drafts of the WN.
4. London 1773–76

Smith left Kirkcaldy in the spring of 1773 to begin a stay of some 3 years in London. David Hume assumed that publication of WN was imminent, but in fact, the
book did not appear until March 1776. The reason may well be that Smith was engaged in systematic revision of a complex work. But he was also concerned with two sophisticated areas of analysis. The first of these related to Smith's interest in the optimal organization of public services, notably education, and features in a long letter addressed to William Cullen dated September 1774. The issue of public services was one of the topics that Smith addressed during his stay in London. Cullen had written to Smith seeking his opinion on proposals from the Royal College of Physicians in Edinburgh. The petition suggested that doctors should be graduates, that they should have attended College for at least 2 years, and that their competence be confirmed by examination. Smith opposed this position, arguing that universities should not have a monopoly of provision and in particular that the performance of academic staff should be an important factor in determining salaries. In sum, an efficient system of higher education requires: a state of free competition between universities and private teachers; a capacity effectively to compete in the market for men of letters; freedom of choice for students as between teachers, courses, and colleges, together with the capacity to be sensitive to market forces, even if those forces were not always themselves sufficient to ensure the provision of the basic infrastructure. Smith also argued that education should be paid for through a combination of private and public funding (WN, V.i.i.5). Smith was arguing in favour of the doctrine of 'induced efficiency' and applied the principles involved to the whole range of public services (Skinner 1996, chap. 8). At the same time, Smith also addressed another complex question, which was considered at great length in WN (Book IV, vii), namely the developing tensions with the American Colonies. Indeed, Hume believed that Smith's growing preoccupation with the Colonial Question was the cause of the delay in the publication of the book (Corr, letter 139). Smith's long analysis of the issues involved provides the centerpiece of his critique of the 'mercantile' system of regulation (WN, IV, v.a.24). On Smith's account the Regulating Acts of Trade and Navigation in effect confined the Colonies to the production of primary products, while the 'mother country' concentrated on more refined manufactures—thus creating a system of complementary markets which benefited both parties. But it was Smith's contention that the British economy could not sustain indefinitely the cost of colonial administration and that in the long run the rate of growth in America would come into conflict with restrictions currently imposed. Smith, therefore, argued that Great Britain should dismantle the Acts voluntarily with a view to creating a single free trade area, in effect an Atlantic Economic Community, with a harmonized system of taxation, possessing all the advantages of a common language and culture.
Smith accepted the principle that there should be no taxation without representation. In this connection, he noted that:

In the course of little more than a century perhaps the produce of American taxation might exceed that of British taxation. The seat of empire would then naturally remove itself to that part of the empire which contributed most to the general defence of the whole (WN IV, vii.c.79).
Hugh Blair, Professor of Rhetoric in Edinburgh, objected to Smith's inclusion of this material on the grounds that 'it is too much like a publication for the present moment' (Corr, letter 151). Others were more perceptive, recognizing that Smith was applying principles drawn from his wider system to the specific case of America (e.g., Thomas Pownall, Corr, app. A). Pownall, a former Governor of Massachusetts, was an acute critic of Smith's position (Skinner 1996, chap. 8) but recognized that Smith had sought to produce 'an institute of the Principia of those laws of motion, by which the operations of the community are directed and regulated, and by which they should be examined' (Corr, p. 354).
5. The Wealth of Nations

Dugald Stewart, Professor of Moral Philosophy in Edinburgh, noted that: 'It may be doubted, with respect to Mr Smith's Inquiry, if there exists any book beyond the circle of the mathematical and physical sciences, which is at once so agreeable in its arrangement to the rules of a sound logic and so accessible to the examination of ordinary readers' (Stewart 1977, IV.22). This is a compliment which Smith would have appreciated, conscious as he was of the 'beauty of a systematical arrangement of different observations connected by a few common principles' (WN, V.i.f.25). But there are really two systems in WN. The first is a model of 'conceptualised reality' (Jensen 1984) which provides a description of a modern economy, which owed much to the Physiocrats. The second system is analytical and builds upon the first.

5.1 A Model of Conceptualized Reality

If the Theory of Moral Sentiments provides an account of the way in which people erect barriers against their own passions, thus meeting a basic precondition for economic activity, it also provided an account of the psychological judgments on which that activity depends. The historical argument on the other hand explains the origins and nature of the modern state and provides the reader with the means of understanding the essential nature of the exchange economy. For Smith: 'the great commerce of every civilised society, is that carried on between the inhabitants of
the town and those of the country ... The gains of both are mutual and reciprocal, and the division of labour is in this, as in all other cases, advantageous to all the different persons employed in the various occupations into which it is subdivided' (WN, III.i.1). The concept of an economy involving a flow of goods and services, and the appreciation of the importance of intersectoral dependencies, were familiar in the eighteenth century. Such themes are dominant features of the work done, for example, by Sir James Steuart and David Hume. But what is distinctive about Smith's work, at least as compared to his Scottish contemporaries, is the emphasis given to the importance of three distinct factors of production (land, labor, capital) and to the three categories of return (rent, wage, profit) which correspond to them. What is distinctive to the modern eye is the way in which Smith deployed these concepts in providing an account of the flow of goods and services between the sectors involved and between the different socioeconomic groups (proprietors of land, capitalists, and wage-labor). The approach is also of interest in that Smith, following the lead of the French Economists, worked in terms of period analysis—typically the year was chosen, so that the working of the economy is examined within a significant time dimension as well as over a series of time periods. Both versions of the argument emphasise the importance of capital, both fixed and circulating.
5.2 A Conceptual Analytical System

The 'conceptual' model which Smith had in mind when writing the Wealth of Nations is instructive and also helps to illustrate the series of separate but interrelated problems which economists must address if they are to attain the end which Smith proposed, namely an understanding of the full range of problems which the economy presents. Smith, in fact, addressed a series of areas of analysis which began with the problem of value, before proceeding to the discussion of the determinants of price, the allocation of resources between competing uses, and, finally, an analysis of the forces which determine the distribution of income in any one time period and over time. The analysis offered in the first book enabled Smith to proceed directly to the treatment of macroeconomic issues and especially to a theory of growth which provides one of the dominant features of the work as a whole (cf. Skinner 1996, chap. 7). The idea of a single, all-embracing conceptual system, whose parts should be mutually consistent, is not an ideal which is so easily attainable in an age where the division of labor has increased significantly the quantity of science through specialization. But Smith becomes even more informative when we map the content of the 'conceptual (analytical) system' against a model of the economy which is essentially descriptive.
Perhaps the most significant feature of Smith's vision of the 'economic process,' to use Blaug's phrase, lies in the fact that it has a significant time dimension. For example, in dealing with the problems of value in exchange, Smith, following Hutcheson, made due allowance for the fact that the process involves judgments with regard to the utility of the commodities to be received, and the disutility involved in creating the commodities to be exchanged. In the manner of his predecessors, Smith was aware of the distinction between utility (and disutility) anticipated and realized, and, therefore, of the process of adjustment which would take place through time. Young (1997, p. 61) has emphasized that the process of exchange may itself be a source of pleasure (utility). In an argument which bears upon the analysis of the TMS, Smith also noted that choices made by the 'rational' individual may be constrained by the reaction of the spectator of his conduct—a much more complex situation than that which more modern approaches may suggest. Smith makes much of the point in his discussion of Mandeville's 'licentious' doctrine that private vices are public benefits, in suggesting that the gratification of desire is perfectly consistent with observance of the rules of propriety as defined by the 'spectator,' that is, by an external agency. In an interesting variant on this theme, Etzioni (1988, pp. 21–4) noted the need to recognize 'at least two irreducible sources of valuation or utility; pleasure and morality.' He added that modern utility theory 'does not recognise the distinct standing of morality as a major, distinct, source of valuations' and hence as an explanation of 'behaviour' before going on to suggest that his own 'deontological multi-utility model' is closer to Smith than other modern approaches. Smith's theory of price, which allows for a wide range of changes in taste, is also distinctive in that it allows for competition among and between buyers and sellers, while presenting the allocative mechanism as one which involves simultaneous and interrelated adjustments in both factor and commodity markets. As befits a writer who was concerned to address the problems of change and adjustment, Smith's position was also distinctive in that he was not directly concerned with the problem of equilibrium. For him the 'natural' (supply) price was: 'as it were, the central price, to which the prices of all commodities are continually gravitating ... whatever may be the obstacles which hinder them from settling in this center of repose and continuance, they are constantly tending towards it' (WN, I.vii.15). The picture was further refined in the sense that Smith introduced into this discussion the doctrine of 'net advantages' (WN, I.x.a.1). This technical area is familiar to labor economists, but in Smith's case it becomes even more interesting in the sense that it provides a further link with the TMS, and with the discussion of constrained choice. It was Smith's contention that men would only be prepared to
embark on professions that attracted the disapprobation of the spectator if they could be suitably compensated (Skinner 1996, p. 155) in terms of monetary reward. But perhaps the most intriguing feature of the macroeconomic model is to be found in the way in which it was specified. As noted earlier, Smith argued that incomes are generated as a result of productive activity, thus making it possible for commodities to be withdrawn from the 'circulating' capital of society. As he pointed out, the consumption of goods withdrawn from the existing stock may be used up in the present period, or added to the stock reserved for immediate consumption, or used to replace more durable goods which had reached the end of their lives in the current period. In a similar manner, undertakers and merchants may add to their stock of materials, or to their holdings of fixed capital while replacing the plant which had reached the end of its operational life. It is equally obvious that undertakers and merchants may add to, or reduce, their inventories in ways that will reflect the changing patterns of demand for consumption and investment goods, and their past and current levels of production. Smith's emphasis upon the point that different 'goods' have different life-cycles means that the pattern of purchase and replacement may vary continuously as the economy moves through different time periods, and in ways which reflect the various age profiles of particular products as well as the pattern of demand for them. If Smith's model of the 'circular flow' is to be seen as a spiral, rather than a circle, it soon becomes evident that this spiral is likely to expand (and contract) through time at variable rates. It is perhaps this total vision of the complex working of the economy that led Mark Blaug to comment on Smith's sophisticated grasp of the economic process and to distinguish this from his contribution to particular areas of economic analysis (cf. Jensen 1984, Jeck 1994, Ranadive 1984). What Smith had produced was a model of conceptualized reality, which is essentially descriptive, and which was further illuminated by an analytical system which was so organized as to meet the requirement of the Newtonian model (Skinner 1996, chap. 7). Smith's model(s) and the way in which they were specified confirmed his earlier claim that governments ought not to interfere with the economy—a theme stated in the 'manifesto of 1775' (Stewart 1977, IV.25), repeated in LJ, confirmed by Turgot, and even more eloquently defended in WN (IV.ix.51). Smith would no doubt be gratified that Hume had lived to see the publication of WN and pleased with the assessment that it has 'Depth and Solidity and Acuteness' (Corr, letter 150). Hume died in the summer of 1776. Two years later Smith was asked by a former pupil, Alexander Wedderburn, Solicitor-General in Lord North's administration, to advise on the options open to the British Government in the
aftermath of Burgoyne's surrender at Saratoga. Smith returned to his old theme of Union, but recognized that the most likely outcome was military defeat (Corr, app. B). In February of the same year, Smith was appointed Commissioner of Customs and of the Salt Duties, which gave him an income of £600 per annum to be added to the £300 which he still received from the Duke. Smith then moved to Edinburgh where he lived with his mother and a cousin. He died in 1790 after instructing his executors, Joseph Black and James Hutton, to burn the bulk of his papers. We may believe that the pieces which survived (which include the Astronomy) and which were later published in the Essays on Philosophical Subjects, were all specified by Smith.
6. Influence

Adam Smith's influence upon his successors is a subject worthy of a book rather than a few concluding paragraphs. But there are some obvious points to be made even if the list can hardly be exhaustive. If we look at the issues involved from the standpoint of economic theory and policy, the task becomes a little simpler. On the subject of policy, Teichgraeber has noted that Smith's advocacy of free trade or economic liberalism did not 'register any significant victories during his life-time' (1987, p. 338). Indeed, Tribe has argued that 'until the final decade of the eighteenth century, Sir James Steuart's Inquiry was better known than Smith's The Wealth of Nations' (1988, p. 133). The reason is that Steuart's extensive and unique knowledge of conditions in Europe, gained as a result of exile, made him acutely aware of problems of immediate relevance; problems such as unemployment, regional imbalance, underdeveloped economies and the difficulties which were presented in international trade as a result of variations in rates of growth (Skinner 1996, chap. 11). In view of later events, it is ironic to note that Alexander Hamilton considered Steuart's policy of protection for infant industries to be more relevant to the interests of the young American Republic than the 'fuzzy philosophy of Smith' (Stevens 1975, pp. 215–7). But the situation was soon to change. In the course of a review of the way in which WN had been received, Black noted that:

'On the side of policy, the general impression left by the historical evidence is that by 1826 not only the economists but a great many other influential public men were prepared to give assent and support to the system of natural liberty and the consequent doctrine of free trade set out by Adam Smith' (1976, p. 47).
Black recorded that the system of natural liberty attracted attention on the occasion of every anniversary.
But a cautionary note was struck by Viner (1928). Having reviewed Smith's treatment of the function of the state in Adam Smith and Laissez-Faire, a seminal article, Viner concluded:

'Adam Smith was not a doctrinaire advocate of laisser-faire. He saw a wide and elastic range of activity for government, and was prepared to extend it even further if government, by improving its standards of competence, honesty and public spirit, showed itself entitled to wider responsibilities' (Wood 1984, i.164).
But this sophisticated view, which is now quite general, does not qualify Robbins' point that Smith developed an important argument to the effect that economic freedom 'rested on a two fold basis: belief in the desirability of freedom of choice for the consumer and belief in the effectiveness, in meeting this choice, of freedom on the part of the producers' (Robbins 1953, p. 12). Smith added a dynamic dimension to this theme in his discussion of the Corn Laws (WN, IV.v.b). The thesis has proved to be enduringly attractive. Analytically, the situation is also intriguing. Teichgraeber's research revealed that there 'is no evidence to show that many people exploited his arguments with great care before the first two decades of the nineteenth century' (1987, p. 339). He concluded: 'It would seem at the time of his death that Smith was widely known and admired as the author of the Wealth of Nations. Yet it should be noted too that only a handful of his contemporaries had come to see his book as uniquely influential' (1987, p. 363). Black has suggested that for Smith's early nineteenth-century successors, the WN was 'not so much a classical monument to be inspected, but as a structure to be examined and improved where necessary' (1984, p. 44). There were ambiguities in Smith's treatment of value, interest, rent, and population theory. These ambiguities were reduced by the work of Ricardo, Malthus, James Mill, and J. B. Say, making it possible to think of a classical system dominated by short-run self-equilibrating mechanisms and a long-run theory of growth. But there was one result of which Smith would not have approved in that the classical orthodoxy made it possible to think of economics as quite separate from ethics and history. In a telling passage reflecting upon the order in which Smith developed his argument (ethics, history, economics), Hutchison concluded that Smith was unwittingly led, as if by an Invisible Hand, to promote an end which was no part of his intention, that 'of establishing political economy as a separate autonomous discipline' (1988, p. 355). But the economic content of WN did, after all, provide the basis of classical economics in the form of a coherent, all-embracing account of 'general interconnexions' (Robbins 1953, p. 172). As Viner had earlier pointed out, the source of Smith's originality lies in his 'detailed and elaborate application to the
wilderness of economic phenomena of the unifying concept of a co-ordinated and mutually interdependent system of cause and effect relationships which philosophers and theologians had already applied to the world in general' (Wood 1984, i.143). Down the years it is the idea of system which has attracted sustained attention, perhaps because it is now virtually impossible to duplicate a style of thinking which becomes more informative the further we are removed from it. No one who is familiar with the Smithian edifice can fail to notice that he thought mathematically and in a manner which reflects his early interest in related disciplines, including the life sciences—all mechanistic, evolutionary, static, and dynamic, which so profoundly affected the shape assumed by his System of Social Science.

Works of Adam Smith (OUP, 1976–1983)
Corr: Correspondence, eds E C Mossner and I S Ross (1977)
Early Draft of WN: in W R Scott, Adam Smith as Student and Professor (Jacksons, Glasgow, 1937)
EPS: Essays on Philosophical Subjects, incl. the Astronomy, general eds D D Raphael and A S Skinner (1980)
Letter: Letter to Authors of the Edinburgh Review (1756), in EPS
LJ(A) and LJ(B): Lectures on Jurisprudence, eds R L Meek, P G Stein and D D Raphael (1978)
LRBL: Lectures on Rhetoric and Belles Lettres, ed J C Bryce (1983)
TMS: The Theory of Moral Sentiments, eds D D Raphael and A L Macfie (1976)
WN: The Wealth of Nations, eds R H Campbell, A S Skinner and W B Todd (1976)
Stewart: Account of the Life and Writings of Adam Smith (1977), in Corr
Bibliography

Black R D C 1976 Smith's contribution in historical perspective. In: Wilson T, Skinner A S (eds.) The Market and the State. Oxford University Press, Oxford, UK
Campbell R H, Skinner A S 1982 Adam Smith. Croom Helm, London
Etzioni A 1988 The Moral Dimension: Towards a New Economics. Macmillan, London
Groenwegen P 1969b Turgot and Adam Smith. Scottish Journal of Political Economy 16
Hutchison T 1988 Before Adam Smith. Blackwell, Oxford, UK
Jeck A 1994 The macro-structure of Adam Smith's theoretical system. European Journal of the History of Economic Thought 3: 551–76
Jensen H E 1984 Sources and contours in Adam Smith's conceptualised reality. In: Wood J C (ed.) Adam Smith: Critical Assessments, ii.194
Meek R L 1962 The Economics of Physiocracy. Allen and Unwin, London
Meek R L 1973 Turgot on Progress, Sociology and Economics. Cambridge University Press, Cambridge, UK
Mizuta H 2000 Adam Smith: Critical Responses. Routledge, London
Ranadive K R 1984 The wealth of nations: the vision and conceptualisation. In: Wood J C (ed.) Adam Smith: Critical Assessments, ii.244
Robbins L 1953 The Theory of Economic Policy in English Classical Political Economy. Macmillan, London
Ross I S 1995 Life of Adam Smith. Oxford University Press, Oxford, UK
Skinner A S 1996 A System of Social Science: Papers Relating to Adam Smith, 2nd edn. Oxford University Press, Oxford, UK
Stevens D 1975 Adam Smith and the colonial disturbances. In: Skinner A S, Wilson T (eds.) Essays on Adam Smith. Oxford University Press, Oxford, UK
Teichgraeber R 1987 Less abused than I had reason to expect: The reception of the Wealth of Nations in Britain 1776–1790. Historical Journal 80: 337–66
Tribe K P 1988 Governing Economy: The Reformation of German Economic Discourse, 1750–1840. Cambridge University Press, Cambridge, UK
Viner J 1928 Adam Smith and Laissez-Faire. Journal of Political Economy 35: 143–67
Wood J C 1984 Adam Smith: Critical Assessments. Croom Helm, Beckenham, UK
Young J 1997 Economics as a Moral Science: The Political Economy of Adam Smith. Edward Elgar, Cheltenham, UK
A. S. Skinner

Smoking and Health

1. Introduction

Worldwide, 1.1 billion people use tobacco products, primarily cigarettes. An estimated four million die annually as a consequence. By the year 2030, an estimated 1.6 billion will consume tobacco and tobacco's death toll will have skyrocketed to 10 million per year, making tobacco the leading behavioral cause of preventable premature death throughout the world. Tobacco already claims that dubious distinction in the world's developed countries (World Health Organization 1997). The causes of this pandemic of avoidable disease and death are a complex web of physiological, psychological, social, marketing, and policy factors. Social and behavioral scientists of all disciplinary stripes have contributed to identifying and disentangling these factors, although no single model has yet adequately characterized the interactions of these determinants. Social and behavioral scientists have also elucidated important mechanisms by which both private sector and governmental intervention can discourage the initiation of smoking and encourage quitting. After a brief description of the burden of smoking and how epidemiological science unearthed the principal tobacco-related disease connections, the causes of tobacco consumption are summarized. The efforts of social and behavioral scientists to improve the ability of smokers to quit smoking are then described, including better theoretical characterization of the determinants of quitting success and the development of more effective cessation interventions. Next examined is how social and behavioral scientists have analyzed the effects of selected important tobacco control measures, and thereby contributed to the formulation and implementation of tobacco control programs and policies. Finally, the future contribution of the social and behavioral sciences to dealing with a potentially even more cataclysmic crisis in world health during the twenty-first century is considered. Traditional methods of youth smoking prevention, such as school health education and enforcement of minimum age-of-purchase laws, are only touched on, due in part to limited scientific understanding of how initiation can be effectively discouraged, and due to their coverage elsewhere (Lantz et al. 2000; see Substance Abuse in Adolescents, Prevention of; Smoking Prevention and Smoking Cessation; Health Promotion in Schools).

2. Cigarette Smoking and Disease: Making the Connection
In the major industrialized nations, smoking causes from a sixth to a fifth of all mortality. The two major smoking-related causes of death are lung cancer and coronary heart disease (CHD). In the United States, epidemiologists estimate that smoking accounts for approximately 90 percent of all lung cancer deaths, with lung cancer the leading cancer cause of death in both men and women. Smoking is credited with better than a fifth of CHD deaths. In addition, smoking causes four-fifths of all chronic obstructive pulmonary disease mortality and just under a fifth of all stroke deaths (US Department of Health and Human Services 1989). The exposure of nonsmokers to environmental tobacco smoke is also a cause of disease and death (American Council on Science and Health 1999). The toll of smoking is proportionately smaller in developing countries, reflecting the more recent rise and lesser intensity of smoking to this point. However, projections indicate a future chronic disease epidemic quite comparable to that now experienced in the world's more affluent nations (World Health Organization 1997).
Although the health hazards of tobacco smoking have been suspected for centuries, serious interest is a twentieth-century phenomenon, coincident with the advent of cigarette smoking. Prior to the twentieth century, tobacco was smoked primarily in pipes and cigars, chewed, or used nasally as snuff. Harsh tobaccos made deep inhalation of tobacco smoke difficult. Tobacco likely exacted a modest toll through the nineteenth century. In the US, four factors combined in the early twentieth century to make cigarette smoking the most popular, and lethal, form of tobacco consumption. Perfection of the Bonsack cigarette rolling machine in 1884 introduced relatively inexpensive and neater cigarettes to the market. Discovery of the American blend of tobaccos (combining more flavorful tobaccos from Turkey and Egypt with milder American tobaccos) made deep inhalation of cigarette smoke feasible. The development and marketing of Camel cigarettes in 1913, the first American blend cigarette, introduced this new-generation product to the American public with an advertising campaign often credited as inaugurating modern advertising. Finally, cigarettes were included in soldiers' rations during World War I, permitting soldiers a quick and convenient battlefield tobacco break. Considered effeminate and unattractive prior to the war, cigarette smoking came home a 'manly' behavior. Since then, cigarette smoking has dominated all other forms of tobacco use by far in the US and in most countries of the world. Lung cancer was virtually unknown at the beginning of the twentieth century. By 1930, it had become the sixth leading cause of cancer death in men in the United States. Less than a quarter of a century later, lung cancer surpassed colorectal cancer to become the leading cause of male cancer death. It achieved the same status among American women in the mid-1980s, surpassing breast cancer (US Department of Health and Human Services 1989). The rapidly growing rate of lung cancer in the early decades of the century spurred a number of epidemiologists and laboratory scientists to begin investigating the relationship between smoking and cancer. In 1950, Wynder and Graham published a now classic retrospective epidemiologic analysis that strongly linked the behavior to the disease. Throughout the decade, a series of articles documented similar findings from major prospective (cohort) mortality studies in the US and the UK. Scientific groups in both countries soon thereafter published seminal public health documents indicating smoking as the major cause of lung cancer and a cause of multiple other diseases (Royal College of Physicians of London 1962, US Public Health Service 1964, US Department of Health and Human Services 1989). By the end of the twentieth century, approximately 70,000 studies in English alone associated cigarette smoking with a wide variety of malignant neoplasms and cardiovascular and pulmonary diseases, as well as
numerous other disorders. The modern plague of tobacco-produced disease is now rapidly metastasizing—both figuratively and literally—to the world’s poorer nations, where increasing affluence and western image-making have combined to place cigarettes in the mouths of a sizable majority of men and a growing minority of women.
3. The Causes of Smoking

Smoking affords users a very mild 'high' and addresses a number of often seemingly competing physical and psychological needs, for relaxation or stimulation, distraction or concentration, for example. The crucial ingredient in the sustained use of cigarettes is the addictiveness of the nicotine in tobacco smoke. Once addicted, as most regular smokers are, smoking serves the physiological purpose of avoiding nicotine withdrawal. Nicotine affects brain receptors in much the same manner as other addictive substances, and withdrawal shares many of the same unpleasant characteristics (US Department of Health and Human Services 1988). In many countries, smoking is initiated during the teen and even pre-teen years. At the time of initiation, new smokers typically do not appreciate the nature of addiction, much less the addictiveness of nicotine. Further, they tend to be shortsighted, unconcerned about potential distant adverse health consequences (US Department of Health and Human Services 1994). The combination translates into a large pool of new smokers who have adopted the behavior, and become addicted, without considering either the danger or addictiveness of smoking, a situation that most will come to regret. The physical effects of nicotine notwithstanding, history teaches that virtually all drug use is socially conditioned; tobacco smoking is no exception. The use of tobacco in the Americas prior to the arrival of Europeans demonstrates this vividly. Although tobacco played a prominent role in most native societies, in some tobacco smoking was restricted exclusively to the shaman, who used it for medicinal and religious purposes. In other societies, tobacco smoking anointed official nonsectarian functions of tribal leaders (such as the famous 'peace pipe'). In still others, nearly all males smoked tobacco frequently, for personal and social reasons (Goodman 1993). In contemporary society, the initiation of cigarette smoking is often viewed as a rite of passage to adulthood. In many poorer societies, smoking is seen as a symbol of affluence. In virtually all cases, smoking results from role modeling, of parents, prominent celebrities, or youthful peers. Many tobacco control advocates attribute much of smoking's holding power to sophisticated tobacco industry marketing campaigns. In the United States alone, the industry spends over $5 billion per year on
advertising and other forms of marketing. The industry insists that the purpose of its advertising is solely to vie for shares of the existing market of adult smokers, a claim greeted with derision by the public health community. With US smokers exhibiting strong brand loyalty and two companies controlling three-quarters of the US market, marketing directed at brand switching (or its defensive analog, maintaining brand loyalty) would appear to be a relatively fruitless investment. Tobacco control advocates thus believe that much of the industry's marketing effort is directed toward attracting new smokers, primarily children but also groups of first- and second-generation American adults not yet fully acculturated into American society (e.g., Hispanic females, who have a very low smoking prevalence). Similarly, the introduction of aggressive Western cigarette marketing into Asian societies and Eastern European and African countries has been attacked by the public health community as spurring the growth of smoking among children and other traditionally nonsmoking segments of society (e.g., Asian females). In many of these countries, cigarette advertising was nonexistent or restricted to modest use of print media prior to the introduction of advertising by the major multinational tobacco companies. Before dissolution of the Soviet Union, there was virtually no advertising of the state-produced cigarettes in the Eastern European countries. That ended with a flourish when multinational tobacco companies entered the newly liberated economies and, in some instances, took over cigarette production from the inefficient state-run enterprises. In Japan, the state tobacco monopoly had never bothered to advertise on television prior to the entry of the multinationals. In short order, competition from the advertising of Western cigarette brands led to cigarettes becoming the second most advertised product on Japanese television. In Africa, the advertising media portray smoking as the indulgence of the much admired affluent set, with billboards in desperately poor villages depicting high-society Africans smoking and smiling in convivial social settings (McGinn 1997). Although many tobacco control advocates find advertising a convenient villain to attack, widespread smoking has often preceded formal marketing. Tobacco smoking spread like wildfire through much of Europe in the sixteenth and seventeenth centuries. Similarly, male smoking rates in Russia and the Eastern European countries were very high prior to the fall of Communism and the advent of modern cigarette marketing. The effects of advertising are further examined below in Sect. 5.
4. The Art and Science of Smoking Cessation

In countries in which the hazards of smoking have been well publicized, surveys find that most smokers
would like to quit and most have made at least one serious quit attempt, yet relatively few of those who try to quit succeed on any given attempt. In the US, for example, approximately three-quarters of smokers report they want to stop and as many as a third try each year. Yet only 2.5–3 percent succeed in quitting each year. Combined with persistent cessation efforts by many smokers, this modest quit rate has created a population in the US in which there are as many former smokers as current smokers (US Department of Health and Human Services 1989, 1990). In the aggregate, thus, quitting has significantly reduced the toll of smoking, but the toll remains stubbornly high due to the difficulty of quitting. Although most former smokers quit without the aid of formal programs or products, the widespread desire to quit, paired with its difficulty, has created a small but thriving market for smoking cessation. Formal cessation interventions range from the distribution of how-to-quit booklets, to mass media cessation campaigns, to individual and group counseling programs, to self-administered nicotine replacement therapy (NRT) products, to use of cessation pharmaceuticals combined with clinical counseling. The efficacy of interventions ranges from sustained quit rates of about 5–7 percent for the most general and least resource-intensive interventions (e.g., generic cessation booklets) to 30 percent or more for the most resource-intensive programs that combine sophisticated counseling with the use of pharmaceuticals (Agency for Health Care Policy and Research 1996). Over the years, behavioral scientists have helped refine smoking cessation techniques by evaluating interventions and developing theory applied to smoking cessation. Pre-eminent in the domain of theory have been models that characterize how smokers progress, through a series of 'stages,' to contemplate, attempt, and eventually succeed or fail in quitting, with cessation maintenance also examined (Prochaska and DiClemente 1983). Relating cessation advice to smokers' stages of readiness to change constitutes one way of 'tailoring' cessation messages. Another involves tying cessation advice to the specific motivations of individual smokers to quit and their specific concerns. For example, consider a smoker motivated to quit primarily by the high cost of smoking but also concerned about gaining weight. Armed with this knowledge, a cessation counselor can develop specific information on the financial savings the smoker can expect once he or she quits, while suggesting specific strategies to avoid weight gain (or to deal with it if it occurs). Tailored cessation messages have the obvious virtue of meeting the idiosyncratic needs of individual smokers. In the absence of computer technology, however, they would entail the substantial cost of collecting information on individuals' needs and concerns and then developing individualized (tailored) cessation advice for them. Computers, however, permit
Computers permit simple and inexpensive collection and translation of information into tailored cessation materials, such as individualized advice booklets and calendars with specific daily advice and reminders (Strecher 1999). Health information kiosks placed in malls and other public locations can provide instant tailored suggestions to help people quit smoking. The concept of tailoring holds great potential to enhance quit rates both in individual counseling situations and in low-cost nontreatment settings, such as these kiosks.

Social and behavioral scientists have also played a central role in defining appropriate medical treatment of smokers. The US Agency for Health Care Policy and Research (1996) clinical guideline for smoking cessation was produced by an expert committee including social and behavioral scientists who had worked on smoking cessation as service providers or as developers or evaluators of interventions. The guideline urges physicians to counsel their smoking patients regularly about the implications of the behavior and to encourage them to quit. It concludes that a highly effective cessation approach involves physician counseling to quit, supplemented with use of cessation pharmaceuticals and maintenance advice through follow-up contacts. A cost-effectiveness analysis of the guideline added economics to the social sciences contributing to understanding optimal smoking cessation therapy (Cromwell et al. 1997).

Despite substantial improvements in the efficacy of cessation treatments, in any given year only a small fraction of the smokers who say they want to quit succeed in doing so, and only a small fraction of these have employed professional or programmatic assistance. As such, in any short-run period of time, the contribution of smoking cessation interventions to reducing the health toll of smoking is modest at best. This source of frustration has led a subset of smoking cessation professionals to explore methods of achieving ‘harm reduction’ that do not depend on smokers completely overcoming their addictions to nicotine. Harm reduction techniques range from helping smokers to reduce their daily consumption of cigarettes to encouraging consideration of the long-term use of low-risk nicotine-delivery devices, such as nicotine ‘gum’ or the patch. Though fraught with problems, harm reduction may be an idea whose time has come (Warner 2001).
5. Analysis of the Effects of Tobacco Control Policies

Another way to grapple with the toll created by smoking is to develop policies that discourage the initiation or continuation of smoking, or that restrict it to areas in which it will not impose risk on nonsmokers. Social and behavioral scientists have devoted substantial effort to studying the impacts of
policies, as well as the processes by which such policies come to be adopted. This section examines the former. (For the latter, see, e.g., Fritschler and Hoefler 1996, Kagan and Vogel 1993.) The World Health Organization and the World Bank have described and evaluated a wide array of tobacco control policies in countries around the globe (Roemer 1993, Prabhat and Chaloupka 1999). Three that have commanded the greatest amount of research attention are taxation, restricting or banning advertising and promotion, and limiting smoking in public places.
5.1 Taxation

Research performed primarily by economists has established that taxing cigarettes is among the most effective measures to decrease smoking (Chaloupka and Warner 2000, Prabhat and Chaloupka 1999). Because the quantity of cigarettes consumed declines by an amount proportionately smaller than the associated tax rise, taxation increases government revenues at the same time that it decreases smoking. In developed countries, economists find that a 10 percent increase in cigarette price induces approximately a 4 percent decrease in quantity demanded. In developing countries, the demand impact may be twice as large. Effective dissemination of research findings has made taxation one of the central tenets of a comprehensive tobacco control program.

Although the ‘bottom line’ about taxation is well established, research by economists and others points to questions that remain unanswered. For example, does taxation discourage the initiation of smoking? Although evidence preponderantly suggests that it does, recent studies challenge the conventional view (Chaloupka and Warner 2000, Chaloupka 1999). ‘Side effects,’ or subtle unanticipated impacts of taxation, warrant additional attention as well. Notably, cigarette tax increases in the US cause some smokers, particularly younger smokers, to switch to higher nicotine cigarettes to get their customary dose of nicotine from fewer cigarettes (Evans and Farrelly 1998).

In the emerging field of ‘behavioral economics,’ a small cadre of psychologists is studying experimentally how smoking responds to incentives. For example, investigators give subjects a daily ‘budget’ (e.g., play money) with which to buy cigarettes, food, and other commodities, at prices established by the investigators, and evaluate the effects of ‘taxation’ by raising the price of cigarettes. Findings generally have been quite consistent with those in the mainstream economics literature (Chaloupka et al. 1999). Behavioral economics can investigate questions not addressable using existing real world data, but also has decided limitations in predicting how people will
respond to price changes in that real world. By learning from each other, the two fields can enrich the evidentiary base for future tobacco tax policy.
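The elasticity arithmetic underlying these findings can be made concrete with a small worked sketch. The Python fragment below is illustrative only: it applies the −0.4 developed-country elasticity quoted above to hypothetical baseline figures, showing why inelastic demand lets a tax cut consumption and raise revenue at the same time.

```python
# Back-of-envelope price-elasticity arithmetic (illustrative only).
# elasticity = (% change in quantity demanded) / (% change in price).

def demand_response(elasticity, price_rise_pct, base_quantity, base_price):
    """Approximate consumption and total spending after a price rise."""
    new_quantity = base_quantity * (1 + elasticity * price_rise_pct / 100)
    new_price = base_price * (1 + price_rise_pct / 100)
    return new_quantity, new_quantity * new_price

# Developed-country estimate of -0.4 and a 10 percent price rise, applied to
# hypothetical baselines of 100 units sold at a price of 1.0:
quantity, spending = demand_response(-0.4, 10.0, 100.0, 1.0)
print(round(quantity, 1))  # 96.0  -> consumption falls 4 percent
print(round(spending, 1))  # 105.6 -> total spending (hence tax revenue) rises
```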
5.2 Advertising

A diverse group of sociologists, psychologists, economists, marketing specialists, and public health scholars has studied the effects of marketing on smoking. Research by several scholars concludes that modern Western-style advertising creates imagery that many people, including large proportions of children, find easily recognizable and attractive. There is a strong correlation between children’s interest in cigarette marketing campaigns and their subsequent smoking behavior. Temporally, advertising campaigns directed at specific segments of a population have often been followed by significant growth in smoking in the targeted groups, including women in the US in the 1960s, young women in Asia in the 1990s, and children in many countries (Warner 2000, McGinn 1997).

Whether any of these associations constitutes a causal relationship is the essential question. The same unidentified factor that makes cigarette advertising attractive to certain children could account for their subsequent smoking, independent of the advertising per se. Similarly, cigarette marketers might foresee an (independently occurring) expansion of a market segment and dive into the advertising void to compete for shares of that new market. In the absence of the ability to run randomized controlled trials, empirical analysis of the relationship between cigarette advertising and consumption has been unable to prove a causal connection or to estimate its likely extent (US Department of Health and Human Services 1989, Warner 2000).

However, strong new evidence supporting causality comes from a study re-examining data on the relationship between countries’ policies with regard to cigarette advertising (ranging from no restrictions to complete bans) and levels of smoking within those societies (Saffer and Chaloupka 1999). Blending marketing theory with empirical analysis, this study concluded that a complete ban on cigarette advertising would decrease smoking by about 6 percent, while partial bans (e.g., banning cigarette ads on the broadcast media) would be unlikely to have an impact on cigarette consumption.

Combined with the previously existing evidence, the new research leads to the most plausible interpretation of the relationship between advertising and cigarette consumption. It is a conclusion likely to satisfy neither tobacco industry defenders of the ‘brand-share only’ argument, nor tobacco control advocates who condemn marketing efforts as a principal cause of smoking. Cigarette advertising and other forms of
marketing likely do increase the amount of smoking in a statistically significant manner, possibly accounting for as much as 10 percent of cigarette consumption. (This figure is likely to vary among societies, depending on the maturity of the smoking market and on familiarity with large-scale marketing campaigns.) The converse, of course, is that advertising and marketing almost certainly do not account for the majority of cigarettes consumed. For that majority, one must turn to other influences, all less tractable to policy than advertising, including role modeling, peer behavior, and the addictiveness of nicotine and smoking.
5.3 Restrictions on Smoking in Public Places

‘Clean indoor air laws,’ which restrict or prohibit smoking in public places and workplaces, grew out of concerns that the exposure of nonsmokers to environmental tobacco smoke (ETS) could create a risk of disease, leading in the US to rapid diffusion of state laws, beginning in 1973. A decade later, similar laws emerged at the local level of government, where most of the legislative action in the US has remained since then (Brownson et al. 1997). The scientific knowledge base actually followed early diffusion of legislation, with a number of studies of the relationship between ETS exposure and the risk of lung cancer published in the 1980s (Brownson et al. 1997). By the 1990s, the research base had become sufficiently strong that the US Environmental Protection Agency (EPA) declared ETS a ‘Class A Carcinogen,’ a proven environmental cause of cancer in nonsmoking humans. The EPA estimated that ETS caused approximately 3,000 lung cancer cases annually in the US, and also detailed nonfatal respiratory disease effects of ETS, particularly in children (US Environmental Protection Agency 1992). More recently, research has implicated ETS in heart disease deaths in adult nonsmokers, an impact possibly an order of magnitude greater than the lung cancer toll (American Council on Science and Health 1999).

Social and behavioral scientists have informed the debate by studying the process of the emergence and diffusion of clean indoor air laws and, subsequently, the effects of such laws on nonsmokers’ exposure to ETS and on smokers’ smoking behavior. Studies have found generally high levels of compliance with the laws, even in bars (where low compliance was anticipated by many) (Warner 2000), and reductions in employees’ exposure to ETS, as measured by self-reports, air sampling, and examination of employees’ body burdens of cotinine, a nicotine derivative.

Less obvious is whether clean indoor air laws discourage smoking or merely ‘reallocate’ it to times and places in which smoking is permitted. According to several studies, the laws do discourage smoking
among workers in regulated workplaces, producing increased quit rates and lower daily consumption among continuing smokers (Brownson et al. 1997).
5.4 Interaction of Policies

As policy research becomes more sophisticated, the relationships among policies and their impacts will come to be better understood. Illustrative is research examining the joint effects of tax increases and the adoption of clean indoor air laws. For example, some of the smoking decline credited to tax increases might instead reflect a citizenry more interested in reducing smoking, an interest that could also be reflected in the adoption of clean indoor air laws. Examining the joint effects of the two policies confirmed researchers’ suspicions, suggesting a reduced (but still quite substantial) impact of taxation on smoking. Similarly, disentangling the impacts of state-level tobacco control programs that mix tax increases with media antismoking campaigns is important but analytically challenging. Recent research paves the way toward better evaluation of multiple-component policies (Chaloupka 1999, Chaloupka and Warner 2000).
6. The Future of Social and Behavioral Science Contributions to Tobacco Control

The social and behavioral sciences have contributed greatly toward understanding how and why the epidemic of smoking has evolved throughout the twentieth century. They have also elucidated a set of tools that can help society to extricate itself from the tenacious grip of this public health disaster. With innovations in information and pharmacological technology combining with better insights into human behavior, further improvements in assisting smokers to quit appear virtually certain. Although the long-range goal must focus on preventing future generations of children from starting to smoke, helping current adult smokers to quit remains critical to reducing the tobacco-attributable burden of disease and mortality over the next three decades. As such, the role of social and behavioral scientists in smoking cessation will likely be increasingly productive.

The effectiveness of efforts to encourage smokers to quit depends to a significant degree on the environment in which smoking occurs. If smoking is increasingly viewed as antisocial, quitting smoking will become easier, and more urgent, for current smokers. Policy making is society’s best means of intentionally altering the environment; and better understanding of the effects of policy interventions, and of how they come to be adopted, will be crucial to shaping the
environment with regard to smoking in the coming years.

The multidimensional problems associated with tobacco use vividly illustrate the need for scientists of all disciplinary persuasions to work together. The next great challenge confronting the diverse fields of social and behavioral science in tobacco control is how to combine the methods and insights of the various disciplines into an integrated whole (Chaloupka 1999).

See also: Drug Addiction; Drug Addiction: Sociological Aspects; Drug Use and Abuse: Cultural Concerns; Drug Use and Abuse: Psychosocial Aspects; Smoking Prevention and Smoking Cessation; Substance Abuse in Adolescents, Prevention of
Bibliography

Agency for Health Care Policy and Research 1996 Smoking Cessation: Clinical Practice Guideline, No. 18, Information for Specialists. US Department of Health and Human Services, Public Health Service, Agency for Health Care Policy and Research (AHCPR Publication No. 96-0694), Rockville, MD
American Council on Science and Health 1999 Environmental Tobacco Smoke: Health Risk or Health Hype? American Council on Science and Health, New York
Brownson R C, Eriksen M P, Davis R M, Warner K E 1997 Environmental tobacco smoke: Health effects and policies to reduce exposure. Annual Review of Public Health 18: 163–85
Chaloupka F J 1999 Macro-social influences: The effects of prices and tobacco control policies on the demand for tobacco products. Nicotine & Tobacco Research 1(Suppl. 1): S105–9
Chaloupka F J, Grossman M, Bickel W K, Saffer H (eds.) 1999 The Economic Analysis of Substance Use and Abuse: An Integration of Econometric and Behavioral Economic Research. University of Chicago Press, Chicago
Chaloupka F J, Laixuthai A 1996 US trade policy and cigarette smoking in Asia. National Bureau of Economic Research Working Paper No. 5543. NBER, Cambridge, MA
Chaloupka F J, Warner K E 2000 The economics of smoking. In: Culyer A J, Newhouse J P (eds.) Handbook of Health Economics. Elsevier, Amsterdam
Cromwell J, Bartosch W J, Fiore M C, Hasselblad V, Baker T 1997 Cost-effectiveness of the clinical practice recommendations in the AHCPR guideline for smoking cessation. Journal of the American Medical Association 278: 1759–66
Evans W N, Farrelly M C 1998 The compensating behavior of smokers: taxes, tar, and nicotine. RAND Journal of Economics 29: 578–95
Fritschler A L, Hoefler J M 1996 Smoking and Politics: Policy Making and the Federal Bureaucracy, 5th edn. Prentice Hall, Upper Saddle River, NJ
Goodman J 1993 Tobacco in History: The Cultures of Dependence. Routledge, New York
Kagan R A, Vogel D 1993 The politics of smoking regulation: Canada, France, and the United States. In: Rabin R L, Sugarman S D (eds.) Smoking Policy: Law, Politics, & Culture. Oxford University Press, New York
Lantz P M, Jacobson P D, Warner K E, Wasserman J, Pollack H A, Berson J, Ahlstrom A 2000 Investing in youth
tobacco control: a review of smoking prevention and control strategies. Tobacco Control 9: 47–63
McGinn A P 1997 The nicotine cartel. World Watch 10(4): 18–27
Prabhat J, Chaloupka F J 1999 Curbing the Epidemic: Governments and the Economics of Tobacco Control. World Bank, Washington, DC
Prochaska J O, DiClemente C C 1983 Stages and processes of self-change of smoking: toward an integrative model of change. Journal of Consulting and Clinical Psychology 51: 390–5
Roemer R 1993 Legislative Action to Combat the World Tobacco Epidemic, 2nd edn. World Health Organization, Geneva, Switzerland
Royal College of Physicians of London 1962 Smoking and Health: A Report on Smoking in Relation to Cancer of the Lung and Other Diseases. Pitman Publishing Co., London
Saffer H, Chaloupka F J 1999 Tobacco advertising: economic theory and international evidence. National Bureau of Economic Research Working Paper No. 6958. NBER, Cambridge, MA
Strecher V J 1999 Computer-tailored smoking cessation materials: A review and discussion. Patient Education and Counseling 36: 107–17
US Department of Health and Human Services 1988 The Health Consequences of Smoking: Nicotine Addiction. (DHHS Publication No. (CDC) 88-8406), Government Printing Office, Washington, DC
US Department of Health and Human Services 1989 Reducing the Health Consequences of Smoking: 25 Years of Progress. A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health (DHHS Publication No. (CDC) 89-8411), Rockville, MD
US Department of Health and Human Services 1990 The Health Benefits of Smoking Cessation. A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health (DHHS Publication No. (CDC) 90-8416), Rockville, MD
US Department of Health and Human Services 1994 Preventing Tobacco Use Among Young People: A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health. US Government Printing Office, Washington, DC
US Environmental Protection Agency 1992 Respiratory Health Effects of Passive Smoking: Lung Cancer and Other Disorders. US Environmental Protection Agency, Washington, DC
US Public Health Service 1964 Smoking and Health. Report of the Advisory Committee to the Surgeon General of the Public Health Service. US Department of Health, Education, and Welfare, Public Health Service, Center for Disease Control. PHS Publication No. 1103, Washington, DC
Warner K E 2000 The economics of tobacco: Myths and realities. Tobacco Control 9: 78–89
Warner K E 2001 Reducing harms to smokers: Methods, their effectiveness, and the role of policy. In: Rabin R L, Sugarman S D (eds.) Regulating Tobacco: Premises and Policy Options. Oxford University Press, New York
World Health Organization 1997 Tobacco or Health: A Global Status Report. World Health Organization, Geneva, Switzerland
Wynder E L, Graham E A 1950 Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma: A study of six hundred and eighty-four proved cases. Journal of the American Medical Association 143: 329–36
K. E. Warner
Smoking Prevention and Smoking Cessation

Cigarettes became popular after 1915. Today, approximately 1.1 billion people aged 15 years and over smoke: about one-third of the world’s population in that age group (see Smoking and Health). This article provides an overview of the health effects of smoking and of methods for smoking prevention and cessation.
1. Health Effects

In the late 1940s, epidemiologists noticed that annual death rates due to lung cancer had increased fifteenfold between 1922 and 1947 in several countries. Since the middle of the twentieth century, tobacco products have contributed to more than 60 million deaths in developed countries. The estimated annual mortality is 540,000 in the European Union, 461,000 in the USA, and 457,000 in the former USSR (Peto et al. 1994). Tobacco is held responsible for three and a half million deaths worldwide yearly: about seven percent of all deaths per year. This figure will grow to ten million deaths per year by the 2020s: about 18 percent of all deaths in developed countries and 11 percent of all deaths in developing countries. Half a billion people now alive will eventually be killed by tobacco products; these products will then have killed more people than any other single disease (WHO 1998).

More than 40 chemicals in tobacco smoke cause cancer. Tobacco is a known or probable cause of about 25 diseases. It is recognized as the most important cause of lung cancer, but it kills even more people through many other diseases, including cancers at other sites, heart disease, stroke, emphysema, and other chronic lung diseases. Smokeless tobacco and cigars also have deadly consequences, including lung, larynx, esophageal, and oral cancer (USDHHS 1994). Lifetime smokers have a 50 percent chance of dying from tobacco. Half of these will die in middle age, before age 70, losing 22 years of normal life expectancy.

Exposure to environmental tobacco smoke (ETS) has been found to be an established cause of lung cancer, ischemic heart disease, and chronic respiratory disease in adults. Reported effects for children are sudden infant death syndrome, bronchial hyperresponsiveness, atopy, asthma, respiratory diseases,
reduced lung function, and middle ear disease. Barnes and Bero (1998) demonstrated from 106 reviews that conclusions about ETS were associated with the affiliations of the researchers. Overall, 37 percent of the reviews concluded that passive smoking was not harmful to health; however, 74 percent of these were written by researchers with tobacco industry affiliations. The only factor associated with concluding that passive smoking is not harmful was whether the author was affiliated with the tobacco industry.

Declining consumption in developed countries has been counterbalanced by increasing consumption in developing countries. Globally, 47 percent of men and 12 percent of women smoke. In developing countries 48 percent of men and 7 percent of women smoke, while in developed countries 42 percent of men smoke, as do 24 percent of women. Tobacco use is regarded as the single most important public health issue in industrialized countries.
2. Smoking Prevention

The process of becoming a smoker can be divided into several stages: preparation (never smoking), initial smoking, experimental or occasional (monthly) smoking, and regular (weekly and daily) smoking. In the preparatory stage attitudes towards smoking are formed. While at least 90 percent of the population has ever smoked a cigarette, the likelihood of becoming a regular smoker increases if initial smoking is repeated several times. In the third stage, a child learns how to smoke and the perceived advantages of smoking start to outweigh the disadvantages. In the fourth stage smoking becomes a routine. Most onset of smoking takes place during adolescence, between ages 10 and 20.

Smoking onset, as well as cessation, is influenced by a variety of cultural (e.g., availability, litigation, smoke-free places), biological (addiction), demographic (e.g., socioeconomic status, parent education), social (e.g., parental and peer smoking, parental and peer pressure, social bonding), and psychological factors (e.g., self-esteem, attitudes, self-efficacy expectations). Various efforts have been undertaken to prevent youngsters from starting to smoke (see e.g., Hansen 1992, Reid et al. 1995, USDHHS 1994).

2.1 School-based Prevention Programs

Three types of school programs can be distinguished (see Health Promotion in Schools). Knowledge programs were not effective. The social influence approaches result in reduced onset ranging from 25 to 60 percent; effects may persist up to four years. Long-term effects are found with programs embedded in more comprehensive community approaches. The method includes five to ten lessons, emphasizes short-term consequences of smoking, discusses social (mostly peer) pressures, and includes refusal skills
training. Life skills approaches focus on the training of generic life skills. Their effects are less strong than those of the social influence approaches.

2.2 Out-of-school Approaches

Youngsters can also be reached out of school. Mass media approaches and nonsmoking clubs are popular methods. They attract attention to the subject and may influence attitudes. A review of 63 studies about the effectiveness of mass media found small effects on behavior (Sowden and Arblaster 1999). Targeting smoking parents is important as well; children are almost three times as likely to smoke if their parents do. Helping parents to quit smoking may prevent their children from starting to smoke and may encourage adolescent cessation.

2.3 Policies
A range of policy interventions can be used to stimulate the prevention of smoking.
(a) Price policies can have preventive effects. Higher prices encourage cessation among current smokers and discourage initiation among young smokers. A price elasticity of −0.5 implies that a ten percent increase in price reduces consumption or prevalence by five percent. Price elasticities range between −0.20 and −0.50.
(b) School policies can stimulate smoking prevention. An examination of the impact of school smoking policies on over 4,000 adolescents in 23 Californian schools found that schools with many smoking prevention policies had significantly lower smoking rates than schools with fewer policies and less emphasis on smoking prevention.
(c) Public smoking restrictions can contribute to adolescents’ beliefs that nonsmoking is normative and that smoking creates health problems. Smoking regulations are more effective in preventing teenagers from starting to smoke than in reducing their consumption. Restriction of sales to minors often results in noncompliance: between 60 and 90 percent of adolescents succeeded in buying tobacco products in situations where this was not allowed.
3. Smoking Cessation

Physical addiction is caused by the pharmacological effects of nicotine. Psychological addiction occurs because smoking becomes linked with numerous daily circumstances and activities (e.g., eating, drinking) and with emotional and stressful events. A person becomes motivated to quit if he or she has a positive attitude towards quitting, encounters social support, and has high self-efficacy expectations (De Vries and Mudde 1998).
Cessation is a process: a smoker in precontemplation is not considering quitting within six months; a smoker in contemplation is, but not within a month; a smoker in preparation is considering quitting within a month; a person in action has quit recently; a person in maintenance has quit for more than six months (Velicer and Prochaska 1999).

Three outcome measures can be used to assess smoking cessation: point prevalence (not having smoked during the preceding seven days), prolonged abstinence (not having smoked during six or twelve months), and continuous abstinence (not having smoked at all since the time of intervention). In longitudinal experimental designs more smokers than quitters may drop out, thus resulting in overly optimistic estimates of the success of treatments. To correct for this bias, dropouts are coded as smokers; this is referred to as the ‘intention to treat’ procedure (illustrated in the sketch below). This procedure may, however, result in conservative effect estimates.

The desirability of biochemical validation is still controversial. Misreporting seldom exceeds five percent. Detection of occasional smoking in youngsters is difficult and expensive. Biochemical validation of self-reports should be considered when high demand situations are involved. A random subsample can be used to estimate bias and correct reported cessation rates. Cotinine has emerged as the measure of choice, with carbon monoxide as a cheaper but less sensitive alternative (Velicer et al. 1992).

Comparing the results of cessation studies is hindered by differences in outcome measures, populations (e.g., the percentage of precontemplators), and follow-up periods. This overview includes methods that have evidence for success. The numerous studies assessing and reviewing the efficacy of smoking cessation interventions provide different success rates. Hence, the figures reported below are estimates derived from the various studies reported below, as well as from inspection of other publications.
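As a minimal illustration of the intention-to-treat correction described above, the Python sketch below contrasts a completers-only quit rate with the intention-to-treat rate in which dropouts are coded as smokers. The outcome list is invented for the example.

```python
# Illustrative intention-to-treat calculation; outcomes are invented.
# True = abstinent at follow-up, False = smoking, None = lost to follow-up.
outcomes = [True, False, True, None, False, None, True, False, None, False]

# Completers-only analysis excludes dropouts -> optimistic estimate.
completers = [o for o in outcomes if o is not None]
completer_rate = sum(completers) / len(completers)

# Intention to treat codes dropouts as smokers -> conservative estimate.
itt_rate = sum(o is True for o in outcomes) / len(outcomes)

print(f"completers only:    {completer_rate:.0%}")   # 43%
print(f"intention to treat: {itt_rate:.0%}")         # 30%
```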
3.1 Pharmacotherapy

Pharmacotherapeutic interventions increase quit rates approximately 1.5- to 2-fold. The absolute probability of not smoking at 6–12 months is greater when additional high-intensity support is provided (Hughes et al. 1999, Silagy et al. 1999). Nicotine Replacement Therapy (NRT) products are available in a number of forms, including gum, transdermal patch, nasal spray, lozenge, and inhaler. NRT is recommended to be part of the core treatment package offered to all smokers; there are few instances in which its use is contraindicated. A meta-analysis including 49 trials found significant effects for gum, patches, nasal spray, inhaled nicotine, and sublingual tablets. These effects were largely independent of the intensity of additional support provided or the setting in which the NRT was offered. The efficacy of NRT also appears to be largely independent of other elements of treatment, although absolute success rates are higher with more intensive behavioral support. NRT cessation effects range from 6 to 35 percent, mostly falling between 15 and 25 percent.

Bupropion is an atypical antidepressant that has both dopaminergic and adrenergic actions. Unlike NRT, smokers begin bupropion treatment one week prior to cessation. Recent evidence shows that cessation rates are comparable to those achieved with NRT, between 15 and 25 percent (Hurt et al. 1997, Jorenby et al. 1999). One study found that the nicotine patch was less effective than bupropion (Jorenby et al. 1999). The drug appears to work equally well in smokers with and without a past history of depression.
3.2 Motivational Strategies

Various studies report on the effectiveness of group courses, self-help materials, computer tailoring, and competitions (e.g., Eriksen and Gottlieb 1998, Fisher et al. 1993, Matson et al. 1993, Skaar et al. 1997, Strecher 1999, Velicer and Prochaska 1999).

Group courses can be effective; studies suggest 10 to 30 percent abstinence rates. Their disadvantage is that they attract mostly smokers who are highly motivated to quit, and that many smokers want to quit on their own. Self-help cessation interventions use different formats, such as brochures, cassettes, and self-help guides. Point prevalence quit rates at one-year follow-up range between 9 and 15 percent. Competitions were effective in three out of five workplace studies, although no study showed enhanced smoking cessation past six months. In three studies, the net effect on cessation rates of a competition plus a group cessation program, over the group program alone, varied between 1 and 25 percent. Computer tailoring produces personalized materials for smokers, based on the (cognitive) characteristics of the individual as measured by a questionnaire. Large segments of smokers can be reached (De Vries and Brug 1999). A review of ten randomized trials found significant effects, with quit rates up to 25–30 percent.
3.3 Policy Interventions

Policy interventions can affect cessation (Cummings et al. 1997, Eriksen and Gottlieb 1998, Glantz et al. 1997, Meier and Licari 1997, Zimring and Nelson 1995). Price policies, as with adolescents, influence cessation rates. Price elasticity rates for adults range between −0.3 and −0.55. Increases in cigarette prices may place a greater burden on those with lower incomes, who tend to have greater difficulty in stopping smoking.
Smoke-free areas can be effective as well, although the evidence is not unequivocal. Smoking bans in workplaces resulted in reduced tobacco consumption or cessation at work, with results ranging from 12 percent to 39 percent; the findings on reductions in prevalence were not consistent. Advertising, including tobacco sponsorship of events, is believed to stimulate the uptake of smoking and to reinforce the habit; this belief was supported by results from the COMMIT trial. The use of health warnings, however, has been found to reduce tobacco consumption. Litigation by states and other groups may change the way tobacco is advertised and sold.
3.4 Special Settings for Cessation

Multiple smoking cessation interventions can be applied in specific settings. The goal is to attract large segments of smokers, or particular segments of smokers. Advice by health intermediaries (physicians, nurses, midwives) is effective. A review describing the efficacy of 188 randomized controlled trials reported a two percent cessation rate resulting from a single routine consultation by physicians. While modest, these results were cost-effective. The effects are most salient among special risk groups, such as pregnant women and patients with cardiac diseases, with additional cessation rates ranging from approximately 5 to 30 percent (Law and Tang 1995). In workplace settings, cessation rates were four percent for a simple warning and eight percent for short counseling. They were highest among high-risk groups: 24 percent among workers at high risk for cardiovascular problems and 29 percent among asbestos-exposed workers (Eriksen and Gottlieb 1998).

Community interventions have the potential to reach all segments of a community. They combine different methods, such as mass media approaches, counseling, and self-help materials. Both the broader cardiovascular risk reduction interventions and those focusing solely on smoking had very small effects (1–3 percent cessation rates), although somewhat larger effects may be reached by combining them with mass media approaches (Fisher et al. 1993, Velicer and Prochaska 1999). Workplace programs also often combine methods. They can reach large and diverse groups of smokers and may produce an average long-term quit rate of 13 percent (ranging from 1.5 percent to 37 percent) at an average of 12 months’ follow-up, regardless of the intervention methods (Eriksen and Gottlieb 1998). Group programs were more effective than minimal programs, although less intensive treatments, when combined with high participation rates, can influence the total population as well.
4. Conclusions

School-based smoking prevention can prevent or delay smoking onset. It is unrealistic to expect long-term effects from short programs when prosmoking norms are communicated through multiple channels. Consequently, at least ten lessons are needed during adolescence, spread over several grades. Out-of-school programs are needed as well, since adolescents with the highest rates of tobacco use are least likely to be reached through school-based programs (Stanton et al. 1995). Innovative cessation interventions are also needed, because youth cessation programs show no long-term success. Since multiple interventions are found to be most effective, priority should be given to broad-based interventions aimed at youngsters, schools, parents, and the community as a whole, including